Abstract

The study of social networks is a burgeoning research area. However, most existing
work deals with networks that simply encode whether relationships exist or not. In
contrast, relationships in signed networks can be positive ("like", "trust") or negative
("dislike", "distrust"). The theory of social balance shows that signed networks tend
to conform to some local patterns that, in turn, induce certain global characteristics.
In this paper, we exploit both local as well as global aspects of social balance theory
for two fundamental problems in the analysis of signed networks: sign prediction and
clustering. Motivated by local patterns of social balance, we first propose two families
of sign prediction methods: measures of social imbalance (MOIs), and supervised
learning using high order cycles (HOCs). These methods predict signs of edges based
on triangles and $\ell$-cycles for relatively small values of $\ell$. Interestingly, by examining
measures of social imbalance, we show that the classic Katz measure, which is used
widely in unsigned link prediction, actually has a balance theoretic interpretation
when applied to signed networks. Furthermore, motivated by the global structure of
balanced networks, we propose an effective low rank modeling approach for both sign
prediction and clustering. For the low rank modeling approach, we provide theoretical
performance guarantees via convex relaxations, scale it up to large problem sizes using
a matrix factorization based algorithm, and provide extensive experimental validation
including comparisons with local approaches. Our experimental results indicate that,
by adopting a more global viewpoint of balance structure, we get significant performance
and computational gains in prediction and clustering tasks on signed networks.
Our work therefore highlights the usefulness of the global aspect of balance theory for
the analysis of signed networks.



1 Introduction 

The study of networks is a highly interdisciplinary field that draws ideas and inspiration
from multiple disciplines including biology, computer science, economics, mathematics,
physics, sociology, and statistics. In particular, social network analysis deals with networks
that form between people. With roots in sociology, social network analysis has evolved
considerably. Recently, a major force in its evolution has been the growing importance
of online social networks that were themselves enabled by the Internet and the World
Wide Web. A natural result of the proliferation of online social networks has been the
increased involvement in social network analysis of people from computer science, data
mining, information studies, and machine learning.

Traditionally, online social networks have been represented as graphs, with nodes
representing entities, and edges representing relationships between entities. However, when a
network has like/dislike, love/hate, respect/disrespect, or trust/distrust relationships, such
a representation is inadequate since it fails to encode the sign of a relationship. Recently,
online networks where two opposite kinds of relationships can occur have become
common. For example, online review websites such as Epinions allow users to either like or
dislike other people's reviews. Such networks can be modeled as signed networks, where
edge weights can be either greater or less than 0, representing positive or negative
relationships respectively. The development of theory and algorithms for signed networks is
an important research task that cannot be successfully carried out by merely extending
the theory and algorithms for unsigned networks in a straightforward way. First, many
notions and algorithms for unsigned networks break down when edge weights are allowed
to be negative. Second, there are some interesting theories that are applicable only to
signed networks.

Perhaps the most basic theory that is applicable to signed social networks but does
not appear in the study of unsigned networks is that of social balance [Harary, 1953,
Cartwright and Harary, 1956]. The theory of social balance states that relationships in
friend-enemy networks tend to follow patterns such as "an enemy of my friend is my
enemy" and "an enemy of my enemy is my friend". A notion called weak balance [Davis,
1967] further generalizes social balance by arguing that in many cases an enemy of one's
enemy can indeed act as an enemy. Both balance and weak balance are defined in terms
of local structure at the level of triangles. Interestingly, the local structure dictated by
balance theory also leads to a special global structure of signed networks. We review the
connection between local and global structure of balanced signed networks in Section 2.

Social balance has been shown to be useful for prediction and clustering tasks for
signed networks. For instance, consider the sign prediction problem where the task is to
predict the (unknown) sign of the relationship between two given entities. Ideas derived
from local balance of signed networks can be successfully used to yield algorithms for sign
prediction [Leskovec et al., 2010a, Chiang et al., 2011]. In addition, the clustering problem
of partitioning the nodes of a graph into tightly knit clusters turns out to be intimately
related to weak balance theory. We will see how a clustering into mutually antagonistic
groups naturally emerges from weak balance theory (see Theorem 8 for more details).

The goal of this paper is to develop algorithms for prediction and clustering in signed 
networks by adopting the local to global perspective that is already present in the theory 
of social balance. What we find particularly exciting is that the local-global interplay
that occurs in the theory of social balance also occurs in our algorithms. We hope to 
convince the reader that, even though the local and global definitions of social balance 
are theoretically equivalent, algorithmic and performance gains occur when a more global 
approach in algorithm design is adopted. 

We mentioned above that a key challenge in designing algorithms for signed networks
is that the existing algorithms for unsigned networks may not be easily adapted to the
signed case. For example, it has been shown that spectral clustering algorithms for
unsigned networks cannot, in general, be directly extended to signed networks [Chiang et al.,
2012]. However, we do discover interesting connections between methods meant for
unsigned networks and those meant for signed networks. For instance, in the context of sign
prediction, we see that the Katz measure, which is widely used for unsigned link
prediction, actually has a justification as a sign prediction method in terms of balance theory.
Similarly, methods based on low rank matrix completion can be motivated straight out of
the global viewpoint of balance theory. Thus, we see that existing methods for unsigned
network analysis can reappear in signed network analysis, albeit for different reasons.
Here are the key contributions we make in this paper:

• We provide a local to global perspective of the sign prediction problem, and demon- 
strate that our global methods are superior on synthetic as well as real-world data 
sets. 

• In particular, we propose three sign prediction methods based on (i) measures of 
social imbalance (MOIs), (ii) supervised learning using higher-order cycles (HOCs), 
and (iii) low-rank modeling. The methods using higher-order cycles are more global 
than existing methods that just use triangles, while the low-rank modeling approach 
can be viewed as a fully global approach motivated by global implications of struc- 
tural balance. 

• We show that the Katz measure used for unsigned networks can be interpreted from 
a social balance perspective: this immediately yields a sign prediction method. 

• We provide theoretical guarantees for sign prediction and signed network clustering 
of balanced signed networks, under mild conditions on their structure. 



Readers can find the preliminary versions of this paper in Chiang et al. [2011] and Hsieh et al.
[2012]. The sign prediction methods based on paths and cycles were first proposed in
Chiang et al. [2011], and low-rank modeling in Hsieh et al. [2012]. In this paper, we
provide a more detailed and unifying treatment of our previous research; in particular, we
provide a local-to-global perspective of the proposed methods, and a more comprehensive
theoretical and experimental treatment.

The organization of this paper is guided by the local versus global aspects of social
balance theory. We first review some basics of signed networks and balance theory in
Section 2. We recall notions such as (strong) balance and weak balance while emphasizing
the connections between local and global structures of balanced signed networks. We will
see that local balance structure is revealed by triads (triangles) and cycles, while global
balance structure manifests itself as clusterability of the nodes in the network. Based on
these observations, in Section 3 we start by showing how to use triads for sign prediction.
In Section 4, we go beyond triangles and explore prediction methods based on cycles of
length up to $\ell > 3$. Under this broader view, we can exploit information that is less
localized around an edge whose sign we have to predict. The hope is that going global
should give us higher predictive accuracy. We propose two classes of methods: those based
on measures of social imbalance (MOIs) and those that use supervised learning techniques
to exploit existence of balance at the level of high order cycles (HOCs). In Section 5,
we present a completely non-local approach based on the global structure of balanced
signed networks. We show that such networks have low rank adjacency matrices, so that
we can solve the sign prediction problem by reducing it to a low rank matrix completion
problem. Furthermore, the low rank modeling approach can also be used for the clustering
of signed networks. In Section 6, we conduct several experiments, which show that global
methods (based on low rank models) generally have better performance than local methods
(based on triads and cycles). Finally, we discuss related work in Section 7 and state our
conclusions in Section 8.

2 Signed Networks and Social Balance 

In this section, we set up our notation for signed networks, review the basic notions of 
balance theory, and describe the two main tasks (sign prediction and clustering) that we 
address in this paper. 

2.1 Categories of Signed Networks 

The most basic kind of a signed network is a homogeneous signed network. Formally, a
homogeneous signed network is represented as a graph with the adjacency matrix
$A \in \{-1, 0, 1\}^{n \times n}$, which denotes relationships between entities as follows:

$$A_{ij} = \begin{cases} 1, & \text{if } i \text{ and } j \text{ have a positive relationship,} \\ -1, & \text{if } i \text{ and } j \text{ have a negative relationship,} \\ 0, & \text{if the relationship between } i \text{ and } j \text{ is unknown (or missing).} \end{cases}$$

We should note that we treat a zero entry in $A$ as an unknown relationship instead of
no relationship, since we expect any two entities to have some (hidden) positive or negative
attitude toward each other even if the relationship itself might not be observed. From
an alternative point of view, we can assume there exists an underlying complete signed
network $A^*$, which contains relationship information between all pairs of entities. However,
we can only observe some partial entries of $A^*$, denoted by $\Omega$. Thus, the partially observed
network $A$ can be represented as:

$$A_{ij} = \begin{cases} A^*_{ij}, & \text{if } (i,j) \in \Omega, \\ 0, & \text{otherwise.} \end{cases}$$

A signed network can also be heterogeneous. In a heterogeneous signed network, there
can be more than one kind of entity, and relationships between two, same or different,
entities can be positive and negative. For example, in the online video sharing website
Youtube, there are two kinds of entities - users and videos, and every user can either
like or dislike a video. Therefore, the Youtube network can be seen as a bipartite signed
network, in which all positive and negative links are between users and videos.
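As a minimal sketch of the partially observed network described above, the following Python
snippet builds $A$ from a complete network $A^*$ and an observed set $\Omega$. The function name
`observe` and the toy 3-node network are illustrative assumptions, not part of the original
method; the sketch also assumes the undirected case, so each observed pair is mirrored.

```python
def observe(A_star, omega):
    """Return A with A[i][j] = A_star[i][j] for observed pairs and 0 elsewhere."""
    n = len(A_star)
    A = [[0] * n for _ in range(n)]
    for i, j in omega:
        A[i][j] = A_star[i][j]
        A[j][i] = A_star[j][i]  # assumption: undirected network, keep A symmetric
    return A


# Hypothetical complete network on 3 nodes (diagonal left at 0 for illustration).
A_star = [[0, 1, -1],
          [1, 0, -1],
          [-1, -1, 0]]

# Only the pairs (0, 1) and (1, 2) are observed; (0, 2) stays unknown (0).
A = observe(A_star, omega=[(0, 1), (1, 2)])
```

Here the zero at position (0, 2) encodes an unknown sign rather than the absence of a relationship.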
  Triad   Edge signs   Negative edges   Type
  1       (+, +, +)    0                balanced
  2       (+, -, -)    2                balanced
  3       (-, -, -)    3                unbalanced
  4       (+, +, -)    1                unbalanced

Table 1: Configurations of balanced and unbalanced triads.



In this paper, we will focus our attention on homogeneous signed networks, i.e. net- 
works where relationships are between the same kind of entities. For heterogeneous signed 
networks, it is possible to do some preprocessing to reduce them to homogeneous networks. 
For instance, in a Youtube network, we could possibly infer the relationships between users 
based on their taste of videos. These preprocessing tasks, however, are not trivial. 

In the remaining part of the paper, we will use the term "network" as an abbreviation 
for "signed network", unless we explicitly specify otherwise. In addition, we will now 
mainly focus on undirected signed graphs (i.e. A is symmetric) unless we specify otherwise. 
For a directed signed network, a simple but sub-optimal way to apply our methods is by
considering the symmetric network, $\mathrm{sign}(A + A^T)$.



2.2 Social Balance 



A key idea behind many methods that estimate a high dimensional complex object from
limited data is the exploitation of structure. In the case of signed networks, researchers
have identified various kinds of non-trivial structure [Harary, 1953, Davis, 1967]. In
particular, one influential theory, known as social balance theory, states that relationships
between entities tend to be balanced. Formally, we say a triad (or a triangle) is balanced
if it contains an even number of negative edges. This is in agreement with beliefs such as
"a friend of my friend is more likely to be my friend" and "an enemy of my friend is more
likely to be my enemy". The configurations of balanced and unbalanced triads are shown
in Table 1.

Though social balance specifies the patterns of triads, one can generalize the balance
notion to general $\ell$-cycles. An $\ell$-cycle is defined as a simple path from some node to itself
with length equal to $\ell$. The following definition extends social balance to general $\ell$-cycles:

Definition 1 (Balanced $\ell$-cycles) An $\ell$-cycle is balanced iff it contains an even number
of negative edges.
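Definition 1 can be checked with a one-line test: a cycle has an even number of negative
edges exactly when the product of its edge signs is $+1$. The sketch below assumes the cycle
is given as a list of $\pm 1$ edge signs; the helper name `is_balanced_cycle` is hypothetical.

```python
from math import prod


def is_balanced_cycle(edge_signs):
    """True iff the cycle contains an even number of negative (-1) edges."""
    return prod(edge_signs) > 0


# Triads from Table 1, written as edge-sign lists.
balanced_triads = [[1, 1, 1], [1, -1, -1]]
unbalanced_triads = [[-1, -1, -1], [1, 1, -1]]
```

The same function applies unchanged to longer cycles, for example a 4-cycle with signs `[1, -1, 1, -1]`.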

Table 2 shows some instances of balanced and unbalanced cycles based on the above
definition. To define balance for general networks, we first define the notion of balance for 
complete networks: 

Definition 2 (Balanced complete networks) A complete network is balanced iff all 
triads in the network are balanced. 

Table 2: Some instances of balanced and unbalanced cycles.

Of course, most real networks are not complete. In other words, we expect that there
are always some missing entries in the adjacency matrix. That is, there exist $i, j$ such that
$A_{ij} = 0$. To define balance for general networks, we adopt the perspective of a missing
value estimation problem as follows:

Definition 3 (Balanced networks) A (possibly incomplete) network is balanced iff it 
is possible to assign ±1 signs to all missing entries in the adjacency matrix, such that the 
resulting complete network is balanced. 

So far, the notion of balance is defined by specifying patterns of local structures in 
networks (i.e. the patterns of triads). The following result from balance theory shows 
that balanced networks actually have a nice global structure. 

Theorem 4 (Balance theory, [Cartwright and Harary, 1956]) A network is balanced
iff either (i) all edges are positive, or (ii) we can divide nodes into two groups (or clusters),
such that all edges within clusters are positive and all edges between clusters are negative.

Now we can revisit the definition of balanced $\ell$-cycles (Definition 1). Under that
definition, we can actually verify if a network is balanced or not by looking at all cycles
in the network due to the following well-known theorem:

Theorem 5 A network is balanced iff all its $\ell$-cycles are balanced.

Proof First we prove the forward direction. If we are given a balanced network, then we
can divide the nodes into two antagonistic groups $X$, $Y$ as Theorem 4 shows (note that one
of $X$, $Y$ could be empty). Without loss of generality, given any $\ell$-cycle, we can traverse
this cycle from an arbitrary node $i \in X$, and we will switch the group when passing a
negative edge. After $\ell$ steps we will stop at node $i \in X$ again; therefore, in these $\ell$ steps
we can only pass an even number of negative edges to ensure we stop at group $X$. Thus,
any $\ell$-cycle in this balanced network is balanced.

To prove the other direction, we give a procedure that partitions the network into two
antagonistic groups (say $X$ and $Y$) if all $\ell$-cycles in the network are balanced. Without
loss of generality we can assume the network has only one connected component. We first
pick an arbitrary node $i$ and mark it in group $X$, and try to mark the other nodes by
performing a depth first search (DFS) from $i$. When we traverse an edge $(u, v)$, we mark $v$
as belonging to the same group as $u$ if $(u, v)$ is positive, otherwise we mark $v$ as belonging
to the opposite group as $u$. Since all cycles in the network are balanced, a node marked
as $X$ will not be marked $Y$ later on when traversing cycles. Therefore, after all nodes are
marked, we find two groups $X$, $Y$ such that all edges within $X$ or $Y$ are positive and all
edges between $X$ and $Y$ are negative. By Theorem 4 we conclude that this network is
balanced.
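The marking procedure used in the second half of the proof can be sketched as follows. This
is an illustrative implementation under stated assumptions, not code from the paper: the
network is given as a dictionary mapping undirected edges `(u, v)` to signs, nodes are
numbered `0..n-1`, and the hypothetical function `balanced_partition` returns `None` as
soon as a node would receive two conflicting marks (which signals an unbalanced cycle).

```python
def balanced_partition(n, edges):
    """Try to split nodes into groups (X, Y) consistent with all edge signs.

    edges: dict {(u, v): +1 or -1} of observed undirected edges.
    Returns (X, Y) if the network is balanced, otherwise None.
    """
    adj = {u: [] for u in range(n)}
    for (u, v), s in edges.items():
        adj[u].append((v, s))
        adj[v].append((u, s))

    group = {}  # node -> 0 (group X) or 1 (group Y)
    for start in range(n):  # handle every connected component
        if start in group:
            continue
        group[start] = 0
        stack = [start]  # iterative depth first search
        while stack:
            u = stack.pop()
            for v, s in adj[u]:
                # same group across a positive edge, opposite across a negative one
                expected = group[u] if s > 0 else 1 - group[u]
                if v not in group:
                    group[v] = expected
                    stack.append(v)
                elif group[v] != expected:
                    return None  # conflicting marks: some cycle is unbalanced
    X = {u for u, g in group.items() if g == 0}
    Y = {u for u, g in group.items() if g == 1}
    return X, Y


balanced = {(0, 1): 1, (1, 2): -1, (0, 2): -1}
unbalanced = {(0, 1): 1, (1, 2): 1, (0, 2): -1}
```

On the balanced triangle the procedure recovers the groups `{0, 1}` and `{2}`; on the
triangle with a single negative edge it reports a conflict.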
2.3 Weak Balance

One possible weakness of social balance theory is that the defined balance relationships
might be too strict. In particular, researchers have argued that the degree of imbalance in
the triad with two positive edges (the fourth triad in Table 1) is much stronger than that
in the triad with all negative edges (the third triad in Table 1). Thus, we can say that
the first three triads in Table 1 are weakly balanced. Based on this observation, by also
allowing triads with all negative edges, a weaker version of balance notion can be defined
[Davis, 1967].



As in the case of (strong) social balance, we start with a definition of weak balance in 
a complete network: 

Definition 6 (Weakly balanced complete networks) A complete network is weakly 
balanced iff all triads in the network are weakly balanced. 

The definition for general incomplete networks can be obtained by adopting the perspective 
of a missing value estimation problem: 

Definition 7 (Weakly balanced networks) A (possibly incomplete) network is weakly 
balanced iff it is possible to obtain a weakly balanced complete network by filling the missing 
edges in its adjacency matrix. 

Though Definitions 6 and 7 define weak balance in terms of patterns of local triads,
one can show that weakly balanced networks have a special global structure, analogous to
Theorem 4.



Theorem 8 (Weak balance theory, [Davis, 1967]) A complete network is weakly
balanced iff either (i) all of its edges are positive, or (ii) we can divide nodes into $k$ clusters,
such that all the edges within clusters are positive and all the edges between clusters are
negative.

Thus, we say a network is $k$-weakly balanced if its nodes can be divided into $k$ clusters as
specified in Theorem 8. Note that when $k = 2$, this theorem simply reduces to Theorem 4.
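For a complete network, Theorem 8 suggests a simple check: group nodes that are joined by
positive edges, then verify that every within-group edge is positive and every between-group
edge is negative. The sketch below is an illustration under assumptions, not the clustering
algorithm of this paper: the input is a complete signed adjacency matrix given as nested
lists with $\pm 1$ off-diagonal entries, and the helper name `weak_balance_clusters` is
hypothetical.

```python
def weak_balance_clusters(A):
    """Return a cluster label per node if the complete network A is weakly balanced.

    Candidate clusters are the connected components of the positive-edge graph.
    Returns None if some edge sign contradicts those clusters.
    """
    n = len(A)
    label = [-1] * n
    k = 0
    for s in range(n):
        if label[s] != -1:
            continue
        label[s] = k
        stack = [s]
        while stack:  # flood-fill along positive edges
            u = stack.pop()
            for v in range(n):
                if v != u and A[u][v] > 0 and label[v] == -1:
                    label[v] = k
                    stack.append(v)
        k += 1
    # Verify: positive iff same cluster, for every pair of distinct nodes.
    for u in range(n):
        for v in range(u + 1, n):
            if (A[u][v] > 0) != (label[u] == label[v]):
                return None
    return label


# Three mutually antagonistic groups: {0, 1}, {2}, {3}.
A_weak = [[0, 1, -1, -1],
          [1, 0, -1, -1],
          [-1, -1, 0, -1],
          [-1, -1, -1, 0]]

# A triangle with exactly one negative edge is not weakly balanced.
A_bad = [[0, 1, -1],
         [1, 0, 1],
         [-1, 1, 0]]
```

The first example is 3-weakly balanced even though it contains an all-negative triad, which
strong balance would forbid.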

2.4 Key Problems in Signed Networks 

As in classical social network analysis, we are interested in what we can infer given a signed 
network topology. In particular, we will focus on two core problems — sign prediction 
and clustering. 

In the sign prediction problem, we intend to infer the unknown relationship between
a pair of entities $i$ and $j$ based on partial observations of the entire network of
relationships. More specifically, if we assume that we are given a (usually incomplete) network
$A$ sampled from some underlying (complete) network $A^*$, then the sign prediction task is
to recover the sign patterns of one or more edges in $A^*$. This problem bears similarity to
the structural link prediction problem in unsigned networks [Liben-Nowell and Kleinberg,
2007]. Note that the temporal link prediction problem has also been studied in the context
of an unsigned network evolving in time. The input to the prediction algorithm then
consists of a series of networks (snapshots) instead of a single network. We do not consider
such temporal problems in this paper.

Clustering is another important problem in network analysis. Recall that according 
to weak balance theory (Theorem 8), we can find k groups such that they are mutually
antagonistic in a weakly balanced network. Motivated by this, the clustering task in 
a signed network is trying to identify k antagonistic groups in the network, such that 
most entities within the same cluster are friends while most entities belonging to different 
clusters are enemies. Notice that since this (weak) balance notion only applies to signed 
networks, most traditional clustering algorithms for unsigned networks cannot be directly 
applied. 



3 Local Methods: Exploiting Triads 

Since the basic definition of structural balance is in terms of triangles, a natural approach
for designing sign prediction algorithms proceeds by reasoning locally in terms of
unbalanced triangles. We first define a measure of imbalance based on the number of unbalanced
triangles in the graph,

$$\mu_{\mathrm{tri}}(A) := \sum_{\sigma \in SC_3(A)} \mathbb{1}[\sigma \text{ is unbalanced}], \tag{1}$$

where $SC_3(A)$ refers to the set of triangles in the network $A$. In general, we use $SC_\ell(A)$ to
denote the set of all $\ell$-cycles in the network $A$. A definition essentially similar to the one
above appears in the recent work of van de Rijt [van de Rijt, 2011, p. 103], who observes
that the equivalence

$$\mu_{\mathrm{tri}}(A) = 0 \quad \text{iff} \quad A \text{ is balanced}$$

holds only for complete graphs.

The basic idea of using a measure of imbalance for predicting the sign of a given query
link $\{i,j\}$, such that $i \neq j$ and $A_{ij} = 0$, is as follows. Given the observed graph $A$ and
query $\{i,j\}$, $i \neq j$, we construct two graphs: $A^{+(i,j)}$ and $A^{-(i,j)}$. These are obtained from
$A$ by setting $A_{ij}$ to $+1$ and $-1$ respectively. Formally, these two augmented graphs can
be defined as:

$$A^{+(i,j)}_{u,v} = \begin{cases} 1, & \text{if } (u,v) = (i,j), \\ A_{u,v}, & \text{otherwise,} \end{cases} \qquad A^{-(i,j)}_{u,v} = \begin{cases} -1, & \text{if } (u,v) = (i,j), \\ A_{u,v}, & \text{otherwise.} \end{cases} \tag{2}$$

Given a measure of imbalance, denoted as $\mu(\cdot)$, the predicted sign of $\{i,j\}$ is simply:

$$\mathrm{sign}\left( \mu(A^{-(i,j)}) - \mu(A^{+(i,j)}) \right). \tag{3}$$
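The prediction rule (3) can be sketched directly, without any shortcut, by building both
augmented graphs and comparing their imbalance. The code below is a brute-force illustration
under assumptions: the network is undirected and stored as nested lists, both $(i,j)$ and
$(j,i)$ are set when augmenting, and the triangle-counting measure of (1) is used as the
default measure. The names `mu_tri`, `augment` and `predict_sign` are hypothetical.

```python
from itertools import combinations


def mu_tri(A):
    """Number of unbalanced triangles: all three edges observed, odd number of -1's."""
    count = 0
    for a, b, c in combinations(range(len(A)), 3):
        if A[a][b] * A[b][c] * A[a][c] < 0:
            count += 1
    return count


def augment(A, i, j, sign):
    """Copy of A with the query edge {i, j} set to the given sign."""
    B = [row[:] for row in A]
    B[i][j] = B[j][i] = sign  # assumption: undirected network
    return B


def predict_sign(A, i, j, mu=mu_tri):
    """Rule (3): sign of mu(A^-(i,j)) - mu(A^+(i,j)); 0 means no evidence either way."""
    diff = mu(augment(A, i, j, -1)) - mu(augment(A, i, j, +1))
    return (diff > 0) - (diff < 0)


# Nodes 0 and 1 share a common friend (2) and a common enemy (3).
A = [[0, 0, 1, -1],
     [0, 0, 1, -1],
     [1, 1, 0, 0],
     [-1, -1, 0, 0]]
```

In this toy network, setting the edge $\{0, 1\}$ to $-1$ would create two unbalanced
triangles while setting it to $+1$ creates none, so the rule predicts a positive sign.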
Note that, to be able to do this quickly, we should use a $\mu(\cdot)$ for which the quantity
(3) is efficiently computable. For 3-cycles, this is particularly easy. For a given graph $G$
and a test edge $\{i,j\}$, we are interested in computing the sign of:

$$\sum_{\sigma \in SC_3(A^{-(i,j)})} \mathbb{1}[\sigma] \; - \sum_{\sigma \in SC_3(A^{+(i,j)})} \mathbb{1}[\sigma],$$

where we abuse notation by using the shorthand $\mathbb{1}[\sigma]$ for $\mathbb{1}[\sigma \text{ is unbalanced}]$. Somewhat
surprisingly, this simply amounts to computing the $(i,j)$ entry in the matrix $A^2$, where
$A$ is the (signed) adjacency matrix of $G$. In fact, a more general result will be discussed
below (see Lemma 11).
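The shortcut for 3-cycles can be sketched in a few lines: the $(i,j)$ entry of $A^2$ is the sum,
over all nodes $k$, of $A_{ik} A_{kj}$, so each common neighbor votes $+1$ or $-1$. The snippet
assumes a symmetric signed adjacency matrix stored as nested lists with a zero diagonal; the
function names are hypothetical.

```python
def triad_score(A, i, j):
    """(i, j) entry of A^2: each common neighbor k contributes A[i][k] * A[k][j]."""
    return sum(A[i][k] * A[k][j] for k in range(len(A)))


def predict_sign_triads(A, i, j):
    """Predicted sign of the query edge {i, j}; 0 if there is a tie or no common neighbor."""
    s = triad_score(A, i, j)
    return (s > 0) - (s < 0)


# Nodes 0 and 1 share a common friend (2) and a common enemy (3).
A = [[0, 0, 1, -1],
     [0, 0, 1, -1],
     [1, 1, 0, 0],
     [-1, -1, 0, 0]]
```

Here both common neighbors vote for a positive edge between nodes 0 and 1 ("a friend of my
friend" and "an enemy of my enemy"), so the score is 2.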

A method derived from a measure of imbalance relies on social balance theory for 
link prediction in signed networks. However, real world networks may not conform to the 
prediction of social balance theory or may do so only to a certain extent. To deal with 
this situation, we can use measures of imbalance to derive features that can then be fed to 
a supervised machine learning algorithm along with the signs of the known edges in the 
network. 



Indeed this is the approach pioneered by Leskovec et al. [2010a]. Their feature
construction can be described as follows. Fix an edge $e = (i, j)$. Consider an arbitrary common
neighbor (in an undirected sense) $k$ of $i$ and $j$. The link between $i$ and $k$ can be in 4 possible
configurations:

$$i \xrightarrow{+} k, \qquad i \xrightarrow{-} k, \qquad i \xleftarrow{+} k, \qquad i \xleftarrow{-} k.$$

Similarly, there are 4 possible configurations for the link between $k$ and $j$. Thus, we can
get a total of 16 features for the edge $e$ by considering the number of common neighbors
$k$ in each of the $4 \times 4 = 16$ configurations.

This corresponds to a supervised variant of the $\ell$-cycle method for $\ell = 3$. Let $A^+$
and $A^-$ be the matrices of positive and negative edges such that $A = A^+ + A^-$. In terms
of matrix powers, these sixteen features are nothing but the $(i,j)$ entries in the sixteen
matrices:

$$(A^{b_1})^{t_1} (A^{b_2})^{t_2}, \tag{4}$$

where $b_1, b_2 \in \{+, -\}$, $t_1, t_2 \in \{T, 1\}$, and $(A^{b})^{T}$ denotes the transpose of $A^{b}$.
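The sixteen triad features can be sketched with NumPy as below. This is a hedged
illustration rather than the feature extractor of Leskovec et al.: it assumes a directed
signed adjacency matrix with entries in $\{-1, 0, 1\}$, and it uses 0/1 indicator matrices for
the positive and negative parts so that every feature is a count of common neighbors. The
function name `triad_features` and the key format are hypothetical.

```python
import itertools

import numpy as np


def triad_features(A, i, j):
    """Return a dict with the 16 triad features of the directed edge (i, j).

    Keys are (b1, t1, b2, t2) with b in {'+', '-'} (sign) and t in {'1', 'T'}
    (no transpose / transpose); values count common neighbors k of that type.
    """
    parts = {"+": (A > 0).astype(int),  # indicator of positive edges (assumption)
             "-": (A < 0).astype(int)}  # indicator of negative edges (assumption)
    feats = {}
    for b1, t1, b2, t2 in itertools.product("+-", "1T", "+-", "1T"):
        M1 = parts[b1].T if t1 == "T" else parts[b1]
        M2 = parts[b2].T if t2 == "T" else parts[b2]
        feats[(b1, t1, b2, t2)] = int((M1 @ M2)[i, j])
    return feats


# Directed toy network: 0 -> 2 is positive, 1 -> 2 is negative.
A = np.zeros((3, 3), dtype=int)
A[0, 2] = 1
A[1, 2] = -1
features = triad_features(A, 0, 1)
```

For the edge $(0, 1)$ the only common neighbor is node 2, reached by a forward positive link
from 0 and a backward negative link to 1, so exactly one of the sixteen counts is nonzero.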

Note that we have described the features of a directed edge $e = (i, j)$. Social balance
theory has mostly been concerned with undirected networks and hence methods based on
measures of imbalance can deal with undirected networks only. When we learn weights
for features that are motivated by balance theory, we are weakening our reliance on social
balance theory but can therefore naturally deal with directed graphs.
4 Going Global: Exploiting Longer Cycles

For an incomplete graph, imbalance might manifest itself only if we look at longer simple
cycles. Accordingly, we define a higher-order analogue of (1),

$$\mu^{s}_{\ell}(A) := \sum_{i=3}^{\ell} \beta_i \sum_{\sigma \in SC_i(A)} \mathbb{1}[\sigma \text{ is unbalanced}], \tag{5}$$

where $\ell \geq 3$ and the $\beta_i$'s are coefficients weighting the relative contributions of unbalanced
simple cycles of different lengths. If we choose a decaying choice of $\beta_i$, like $\beta_i = \beta^i$ for
some $\beta \in (0, 1)$, then we can even define an infinite-order version,

$$\mu^{s}_{\infty}(A) := \sum_{i \geq 3} \beta_i \sum_{\sigma \in SC_i(A)} \mathbb{1}[\sigma \text{ is unbalanced}]. \tag{6}$$

It is clear that $\mu^{s}_{\infty}(\cdot)$ is a genuine measure of imbalance in the sense formalized by the
following theorem.

Theorem 9 Fix an observed graph $A$. Let $\beta_i > 0$ be any sequence such that the infinite
sum in (6) is well-defined. Then, $\mu^{s}_{\infty}(A) > 0$ iff $A$ is unbalanced.

Proof This follows directly from Theorem 5. ■



This suggests that we could use $\mu^{s}_{\infty}(\cdot)$ as a measure of imbalance to derive sign
prediction algorithms. However, enumerating simple cycles of a graph is NP-complete¹. To get
around this computational issue, we slightly change the definition of $\mu^{s}_{\ell}(\cdot)$ to the following.

$$\mu_{\ell}(A) := \sum_{i=3}^{\ell} \beta_i \sum_{\sigma \in C_i(A)} \mathbb{1}[\sigma \text{ is unbalanced}]. \tag{7}$$

As before, we allow $\ell = \infty$ provided the $\beta_i$'s decay sufficiently rapidly.

$$\mu_{\infty}(A) := \sum_{i \geq 3} \beta_i \sum_{\sigma \in C_i(A)} \mathbb{1}[\sigma \text{ is unbalanced}]. \tag{8}$$

The only difference between these definitions and (5), (6) is that here we sum over all
cycles (denoted by $C_i(A)$), not just simple ones. However, we still get a valid notion of
imbalance as stated by the following result.

Theorem 10 Fix an observed graph $A$. Let $\beta_i > 0$ be any sequence such that the infinite
sum in (8) is well-defined. Then, $\mu_{\infty}(A) > 0$ iff $A$ is unbalanced.

Proof One direction is trivial. If $A$ is unbalanced then there is an unbalanced simple
cycle. However, any simple cycle is obviously a cycle and hence the sum in (8) will be
strictly positive.

For the other direction, suppose $\mu_{\infty}(A) > 0$. This implies there is an unbalanced cycle
$\sigma$ in the graph. Decompose the unbalanced cycle into finitely many simple cycles. We will
be done if we could show that one of these simple cycles has to be unbalanced. It is easy
to see why this is true: if all of these simple cycles were balanced, they all would have had
an even number of negative edges, but then the total number of negative edges in $\sigma$ could
not have been odd. ■

¹ By a straightforward reduction from the Hamiltonian cycle problem [Karp, 1972].



4.1 Katz Measure Works for Signed Networks 



The classic method of Katz [1953] has been used successfully for unsigned link prediction
[Liben-Nowell and Kleinberg, 2007]. However, by considering a sign prediction method
based on $\mu_{\infty}(\cdot)$ we obtain an interesting interpretation of the Katz measure on a signed
network from a balance theory viewpoint. The following result is the key to such an
interpretation.

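Lemma 11 Fix $A$ and let $i \neq j$ be such that $(i,j) \notin \Omega$. Let $A^{+(i,j)}$ and $A^{-(i,j)}$ be the
augmented graphs as defined in (2). Then, for any $\ell \geq 3$,

$$\sum_{\sigma \in C_\ell(A^{-(i,j)})} \mathbb{1}[\sigma] \; - \sum_{\sigma \in C_\ell(A^{+(i,j)})} \mathbb{1}[\sigma] = (A^{\ell-1})_{i,j}.$$

Proof Define the sets of $\ell$-cycles,

$$C^{+}_{\ell}(i,j) := \{\sigma \in C_\ell(A^{+(i,j)}) : \sigma \text{ includes } (i,j)\},$$
$$C^{-}_{\ell}(i,j) := \{\sigma \in C_\ell(A^{-(i,j)}) : \sigma \text{ includes } (i,j)\},$$

that include the edge $(i,j)$. Note that, since $A^{+(i,j)}$ and $A^{-(i,j)}$ only differ in the sign of
the edge $(i,j)$, we have,

$$C_\ell(A^{-(i,j)}) \setminus C^{-}_{\ell}(i,j) = C_\ell(A^{+(i,j)}) \setminus C^{+}_{\ell}(i,j).$$

Thus, we have,

$$\sum_{\sigma \in C_\ell(A^{-(i,j)})} \mathbb{1}[\sigma] \; - \sum_{\sigma \in C_\ell(A^{+(i,j)})} \mathbb{1}[\sigma]$$
$$= \sum_{\sigma \in C^{-}_{\ell}(i,j)} \mathbb{1}[\sigma] + \sum_{\sigma \in C_\ell(A^{-(i,j)}) \setminus C^{-}_{\ell}(i,j)} \mathbb{1}[\sigma] - \sum_{\sigma \in C^{+}_{\ell}(i,j)} \mathbb{1}[\sigma] - \sum_{\sigma \in C_\ell(A^{+(i,j)}) \setminus C^{+}_{\ell}(i,j)} \mathbb{1}[\sigma]$$
$$= \sum_{\sigma \in C^{-}_{\ell}(i,j)} \mathbb{1}[\sigma] \; - \sum_{\sigma \in C^{+}_{\ell}(i,j)} \mathbb{1}[\sigma]. \tag{9}$$

Now cycles in $C^{\pm}_{\ell}(i,j)$ are in 1-1 correspondence with paths $\pi$ in $\mathcal{P}_{\ell-1}(i,j)$ of length
$\ell - 1$ in the original graph $A$, that go from $i$ to $j$. Moreover, $\sigma \in C^{-}_{\ell}(i,j)$ is unbalanced iff
the corresponding path in $\mathcal{P}_{\ell-1}(i,j)$ has an even number of $-1$'s. Similarly, $\sigma \in C^{+}_{\ell}(i,j)$
is unbalanced iff the corresponding path in $\mathcal{P}_{\ell-1}(i,j)$ has an odd number of $-1$'s. Thus,
continuing from (9):

$$= \sum_{\pi \in \mathcal{P}_{\ell-1}(i,j)} \left( \mathbb{1}[\pi \text{ has even no. of } -1\text{'s}] - \mathbb{1}[\pi \text{ has odd no. of } -1\text{'s}] \right)$$
$$= \sum_{i_1, i_2, \ldots, i_{\ell-2}} A_{i,i_1} A_{i_1,i_2} \cdots A_{i_{\ell-2},j}$$
$$= (A^{\ell-1})_{i,j},$$

where the second equality is true because $A$ only has $\pm 1, 0$ entries. ■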
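Using Lemma 11, it is easy to see that the prediction (3) using (7) reduces to

$$\mathrm{sign}\left( \mu_{\ell}(A^{-(i,j)}) - \mu_{\ell}(A^{+(i,j)}) \right) = \mathrm{sign}\left( \sum_{t=3}^{\ell} \beta_t \, (A^{t-1})_{i,j} \right).$$

Similar to the $\ell$-cycle case, the prediction (3) using (8) reduces to

$$\mathrm{sign}\left( \mu_{\infty}(A^{-(i,j)}) - \mu_{\infty}(A^{+(i,j)}) \right) = \mathrm{sign}\left( \sum_{t \geq 3} \beta_t \, (A^{t-1})_{i,j} \right) \tag{10}$$

using Lemma 11.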

Following the above reduction, the connection between the Katz measure and $\mu_{\infty}(\cdot)$ stands
out. This connection is stated as the following theorem:

Theorem 12 (Balance Theory Interpretation of the Katz Measure) Consider the
sign prediction rule (3) using $\mu_{\infty}(\cdot)$ in the reduced form (10). In the special case when
$\beta_i = \beta^{i-1}$ with $\beta$ small enough ($\beta < 1/\|A\|_2$), the rule can be expressed as the Katz
prediction rule for edge sign prediction, in closed form:

$$\mathrm{sign}\left( \left( (I - \beta A)^{-1} - I - \beta A \right)_{i,j} \right).$$

The Katz prediction rule has been successfully used as a link prediction method for
unsigned networks [Liben-Nowell and Kleinberg, 2007] but here we see it reappearing for
link prediction in signed networks from a social balance point of view. We find this
connection between Katz measure and social balance intriguing.
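The closed form in Theorem 12 follows from the geometric series
$\sum_{t \geq 3} \beta^{t-1} A^{t-1} = (I - \beta A)^{-1} - I - \beta A$, which converges when
$\beta \|A\|_2 < 1$. The NumPy sketch below compares the closed form against a truncated version
of the sum in (10). It is an illustration under assumptions: a small dense symmetric matrix, an
explicit inverse (a linear solve or sparse method would be preferable at scale), and
hypothetical function names.

```python
import numpy as np


def katz_sign_scores(A, beta):
    """Closed-form scores (I - beta*A)^(-1) - I - beta*A; requires beta * ||A||_2 < 1."""
    if beta * np.linalg.norm(A, 2) >= 1:
        raise ValueError("beta must satisfy beta < 1 / ||A||_2")
    I = np.eye(A.shape[0])
    return np.linalg.inv(I - beta * A) - I - beta * A


def truncated_scores(A, beta, max_len):
    """Truncated sum over t = 3..max_len of beta^(t-1) * A^(t-1)."""
    total = np.zeros(A.shape, dtype=float)
    power = A.astype(float)  # holds A^1 before the loop
    for t in range(3, max_len + 1):
        power = power @ A  # now equals A^(t-1)
        total += beta ** (t - 1) * power
    return total


# Nodes 0 and 1 share a common friend (2) and a common enemy (3).
A = np.array([[0, 0, 1, -1],
              [0, 0, 1, -1],
              [1, 1, 0, 0],
              [-1, -1, 0, 0]])
S = katz_sign_scores(A, beta=0.1)
predicted = int(np.sign(S[0, 1]))  # predicted sign of the query edge {0, 1}
```

In this toy network every walk from node 0 to node 1 has a positive sign product, so the
score is positive and the rule predicts a positive edge.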
4.2 Learning the Weights

As noted in Section 3, Leskovec et al. [2010a] used triangle-based features to learn weights
using a supervised learning method. A criticism against using only these triangle-based
features is that there could be many people in the social network who do not share friends.
In fact, this is the case in most of the networks that are used by Leskovec et al. [2010a].
The reason their method is able to predict well on such pairs is that they additionally
use seven other "degree-type" features like in-degree and out-degree (and their signed
variants). Thus, the prediction for an edge with zero embeddedness (embeddedness refers
to the number of common neighbors of the vertices of an edge) relies completely on the
degree-based features. These degree features could possibly introduce a bias in learning.
For example, a node that is predisposed to make positive relationships biases the classifier
to predict positive relationships.

This criticism thus necessitates incorporating features from higher-order cycles.
Generalizing the construction (4), we can define 64 fourth-order features (corresponding to
4-cycles in the graph) of an edge as the entries in the matrices:

$$(A^{b_1})^{t_1} (A^{b_2})^{t_2} (A^{b_3})^{t_3}, \tag{11}$$

where $b_i \in \{\pm\}$ indicates whether we look at the positive or negative part of $A$ and
$t_i \in \{T, 1\}$ indicates whether or not we transpose it. There are 4 possibilities for each
$(b_i, t_i)$ pair, resulting in a total of $4 \times 4 \times 4 = 64$ possibilities.

By now the reader can guess the construction of features of a general order $\ell \geq 3$. For
the edge $(i,j)$ they will be the $(i,j)$ entries in the $4^{\ell-1}$ matrices

$$(A^{b_1})^{t_1} (A^{b_2})^{t_2} \cdots (A^{b_{\ell-1}})^{t_{\ell-1}}, \tag{12}$$

with $b_i \in \{\pm\}$, $t_i \in \{T, 1\}$.

Note that the number of features is exponential in $\ell$, and therefore it is not feasible to obtain features from arbitrarily long cycles. We use $\ell \leq 5$ for supervised HOC methods in our experiments, which are presented in Section 6.
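The enumeration of directed cycle features can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name is hypothetical, dense NumPy arrays are used for brevity, and the negative part is assumed to be $\max(-A, 0)$, i.e. an indicator of negative edges.

```python
import itertools
import numpy as np

def directed_cycle_features(A, order):
    """Map each choice ((b_1, t_1), ..., (b_{order-1}, t_{order-1})) to the
    product (A^{b_1})^{t_1} ... (A^{b_{order-1}})^{t_{order-1}}."""
    # Assumption: positive part max(A, 0), negative part max(-A, 0).
    parts = {"+": np.maximum(A, 0), "-": np.maximum(-A, 0)}
    choices = [(b, t) for b in "+-" for t in ("1", "T")]
    features = {}
    for combo in itertools.product(choices, repeat=order - 1):
        M = np.eye(A.shape[0])
        for b, t in combo:
            factor = parts[b].T if t == "T" else parts[b]
            M = M @ factor
        features[combo] = M
    return features
```

The feature vector of an edge $(i, j)$ is then the collection of $(i, j)$ entries across all returned matrices; for order 4 the dictionary has $4^3 = 64$ entries.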



4.3 Reducing the Number of Features 

The number of features can quickly become unmanageable, and computationally infeasible,
as soon as $\ell$ is beyond 5. While dimensionality of the feature space may be the primary
concern, the combinatorial nature of the features also raises the following intuitive concern:
the interpretability of features rendered by high-order cycles, say when $\ell = 6$, composed
of different signs and directions, is a challenge. For example, it is intuitively hard to
appreciate the difference between the two walks i ^ ki ^ k2 ^ j and 

i ^ ki ^ k2 ks ^ k4, ^ j. 

With this realization, one way to quickly reduce the number of features, yet retain the information in longer cycles, is to consider the underlying undirected graph, ignoring the directions. In particular, the $\ell$th-order features will be from the matrices

$$A^{b_1} A^{b_2} \cdots A^{b_{\ell-1}}, \qquad (13)$$

with $b_i \in \{\pm\}$. Note that since we are considering the undirected graph, we ensure that the features are symmetric by summing features of the form $A^{b_1} A^{b_2}$ and $A^{b_2} A^{b_1}$. Thus the number of $\ell$th-order features to compute is reduced to $O(2^{\ell})$ from $O(4^{\ell})$. Though the number of features is still exponential in $\ell$, the construction of features becomes easier for small values of $\ell$.
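A possible sketch of the reduced, undirected feature construction is given below. It assumes the symmetric network is formed as $\operatorname{sign}(A + A^T)$ and merges each sign pattern with its reversal so that every feature matrix is symmetric; the function name and the dictionary keys are illustrative choices.

```python
import itertools
import numpy as np

def undirected_cycle_features(A, order):
    """Symmetric features from products A^{b_1} ... A^{b_{order-1}},
    b_i in {+, -}, computed on the undirected network sign(A + A^T)."""
    S = np.sign(A + A.T)
    parts = {"+": np.maximum(S, 0), "-": np.maximum(-S, 0)}
    features = {}
    for combo in itertools.product("+-", repeat=order - 1):
        key = min(combo, combo[::-1])    # a pattern and its reversal share a key
        M = np.eye(S.shape[0])
        for b in combo:
            M = M @ parts[b]
        features[key] = features.get(key, 0) + M
    return features
```

Because the parts are symmetric, the product for a reversed pattern is the transpose of the original product, so summing the two yields a symmetric matrix.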

We note that another way to avoid dealing with too many features is to use a kernel 
instead. A kernel computes inner products in feature space without explicitly constructing 
the feature map. One can then use off-the-shelf SVM classifiers to perform the classifica- 
tion. We leave this very promising approach of directly defining a kernel on pairs of nodes 
of a graph and using it for link prediction to future work. 



4.4 Classifier 

We use a simple logistic regression where the imbalance of an edge is modeled as a linear combination of the features, which are themselves imbalances in cycles of various lengths and characteristics. Let $V$ be the set of vertices in the network and $\Phi : V \times V \to \mathbb{R}^p$ denote the feature map. Then,

$$P(A_{ij} = +1) = \frac{1}{1 + \exp\big(-w_0 - \langle w, \Phi(i,j) \rangle\big)},$$

from which logistic regression learns $w_0$ and the weight vector $w = [w_1 \cdots w_p] \in \mathbb{R}^p$. The prediction for any query $(i, j)$ is then given by $\operatorname{sign}(P(A_{ij} = +1) - 0.5)$.
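The prediction step of the classifier can be written in a few lines. The sketch below assumes the weights have already been learned; `predict_sign` and its argument names are hypothetical.

```python
import numpy as np

def predict_sign(w0, w, phi):
    """Return (predicted sign, P(A_ij = +1)) for a feature vector phi = Phi(i, j)."""
    p_pos = 1.0 / (1.0 + np.exp(-(w0 + np.dot(w, phi))))
    return int(np.sign(p_pos - 0.5)), p_pos
```

A positive score $w_0 + \langle w, \Phi(i,j) \rangle$ gives a probability above 0.5 and hence a predicted positive edge; a negative score gives a predicted negative edge.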



5 Fully Global: Low Rank Modeling 

In Section 4 we have seen how to use $\ell$-cycles for sign prediction. We have also seen that $\ell$-cycles play a major role in how balance structure manifests itself locally. By increasing $\ell$, the level at which balance structure is considered becomes less localized. Still, it is natural to ask whether we can design algorithms for signed networks by directly making use of their global structure. To be more specific, let us revisit the definition of complete weakly balanced networks (notice that balance is a special case of weak balance). In general, complete weakly balanced networks can be defined from either a local or a global point of view. From a local point of view, a given network is weakly balanced if all triads are weakly balanced, whereas from a global point of view, a network is weakly balanced if its global structure obeys the clusterability property stated in Theorem 8. Therefore, it is natural to ask whether we can directly use this global structure for sign prediction. In the sequel, we show that weakly balanced networks have a "low-rank" structure, so that the sign prediction problem can be formulated as a low rank matrix completion problem.

We begin by showing that given a complete $k$-weakly balanced network, its adjacency matrix $A^*$ has rank at most $k$:

Theorem 13 (Low Rank Structure of Signed Networks) The adjacency matrix $A^*$ of a complete $k$-weakly balanced network has rank 1 if $k \leq 2$, and has rank $k$ for all $k > 2$.
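Proof Since $A^*$ is $k$-weakly balanced, the nodes can be divided into $k$ groups, say $S^{(1)}, S^{(2)}, \ldots, S^{(k)}$. Suppose group $S^{(i)}$ contains nodes $s^{(i)}_1, s^{(i)}_2, \ldots, s^{(i)}_{n_i}$; then the column vectors $A^*_{:,s^{(i)}_1}, \ldots, A^*_{:,s^{(i)}_{n_i}}$ all have the following form (after suitable reordering of nodes):

$$b_i = [\,-1 \;\cdots\; -1 \;\; 1 \;\cdots\; 1 \;\; -1 \;\cdots\; -1\,]^T,$$

where the entries equal to 1 are those indexed by the nodes of $S^{(i)}$, and so the column space of $A^*$ is spanned by $\{b_1, \ldots, b_k\}$.

First consider $k \leq 2$, i.e., the network is strongly balanced. If $k = 1$, it is easy to see that $\operatorname{rank}(A^*) = 1$. If $k = 2$, then $b_1 = -b_2$. Therefore, $\operatorname{rank}(A^*)$ is again 1.

Now consider $k > 2$. In this case, we argue that $\operatorname{rank}(A^*)$ exactly equals $k$ by showing that $b_1, \ldots, b_k$ are linearly independent. We consider the following $k \times k$ square matrix:

$$M = \begin{bmatrix} 1 & -1 & \cdots & -1 \\ -1 & 1 & \cdots & -1 \\ \vdots & \vdots & \ddots & \vdots \\ -1 & -1 & \cdots & 1 \end{bmatrix}.$$

It is obvious that $\mathbf{1} = [1 \; 1 \; \cdots \; 1]^T$ is an eigenvector of $M$ with eigenvalue $-(k - 2)$. We can further construct the other $k - 1$ linearly independent eigenvectors, each with eigenvalue 2:

$$e_1 - e_2, \; e_1 - e_3, \; \ldots, \; e_1 - e_k,$$

where $e_i \in \mathbb{R}^k$ is the $i$th column of the $k \times k$ identity matrix. These $k - 1$ eigenvectors are clearly linearly independent. Therefore, $\operatorname{rank}(M) = k$.

From the above we can show that $\operatorname{rank}(A^*) = k$. Suppose that $b_1, \ldots, b_k$ are not linearly independent; then there exist $\alpha_1, \ldots, \alpha_k$, with some $\alpha_i \neq 0$, such that $\sum_{i=1}^{k} \alpha_i b_i = 0$. Using this set of $\alpha$'s, it is easy to see that $\sum_{i=1}^{k} \alpha_i M_{:,i} = 0$, where $M_{:,i}$ denotes the $i$th column of $M$, but this contradicts the fact that $\operatorname{rank}(M) = k$. Therefore, $\operatorname{rank}(A^*) = k$. ■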
Figure 1: An illustrative example of low-rank structure of a 3-weakly balanced network. The network can be represented as a product of two rank-3 matrices, and so the adjacency matrix has rank no more than 3.

Figure 1 is an example of a complete 3-weakly balanced network. As shown, its adjacency matrix can be expressed as a product of two rank-3 matrices, indicating that its rank is no more than three. In fact, by Theorem 13, we can conclude that its adjacency matrix has rank exactly 3.
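The rank statement of Theorem 13 is easy to check numerically on small examples. The sketch below builds the adjacency matrix of a complete weakly balanced network from a list of group sizes (a hypothetical helper, with diagonal entries set to +1) and computes its rank with NumPy.

```python
import numpy as np

def weakly_balanced_adjacency(group_sizes):
    """Complete k-weakly balanced network: +1 within groups, -1 across groups."""
    labels = np.repeat(np.arange(len(group_sizes)), group_sizes)
    A_star = np.where(labels[:, None] == labels[None, :], 1.0, -1.0)
    return A_star, labels

for sizes in ([3], [2, 3], [2, 3, 4], [1, 2, 3, 4]):
    A_star, _ = weakly_balanced_adjacency(sizes)
    print(len(sizes), "groups -> rank", np.linalg.matrix_rank(A_star))
```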

The above reasoning shows that (adjacency matrices of) complete weakly balanced networks have low rank. However, most real networks are not complete graphs. Recall that in order to define balance on incomplete networks, we try to fill in the unobserved or missing edges (relationships) so that balance is obtained (see Definition 7). Following this desideratum, we can think of sign prediction in signed networks as a low-rank matrix completion problem. Specifically, suppose we observe entries $(i,j) \in \Omega$ of a complete signed network $A^*$. We want to find a complete matrix by assigning $\pm 1$ to every unknown entry, such that the resulting complete graph is weakly balanced and hence the completed matrix is low rank. Thus, our missing value estimation problem can be formulated as:

$$\begin{aligned} \text{minimize} \quad & \operatorname{rank}(X) \\ \text{s.t.} \quad & X_{ij} = A_{ij}, \; \forall (i,j) \in \Omega, \\ & X_{ij} \in \{\pm 1\}, \; \forall (i,j) \notin \Omega. \end{aligned} \qquad (14)$$

Once we obtain the minimizer of (14), which we will denote by $X^*$, we can infer the missing relationship between $i$ and $j$ by simply looking up the sign of the entry $X^*_{ij}$. So the question is whether we can solve (14) efficiently. In general, (14) is known to be NP-hard; however, recent research has shown the surprising result that under certain conditions, the low-rank matrix completion problem (14) can be solved by convex optimization to yield a global optimum in polynomial time [Candès and Recht, 2008]. In the following subsections, we identify such conditions as well as approaches to approximately solve (14) for real-world signed networks.

5.1 Sign Prediction via Convex Relaxation 

One possible approximate solution for (14) can be obtained by dropping the discrete constraints and replacing $\operatorname{rank}(X)$ by $\|X\|_*$, where $\|X\|_*$ denotes the trace norm of $X$, which is the tightest convex relaxation of rank [Fazel et al., 2001]. Thus, a convex relaxation of (14) is:

$$\begin{aligned} \text{minimize} \quad & \|X\|_* \\ \text{s.t.} \quad & X_{ij} = A_{ij}, \; \forall (i,j) \in \Omega. \end{aligned} \qquad (15)$$

It turns out that, under certain conditions, by solving (15) we can recover the exact missing relationships from the underlying complete signed network. This surprising result is the consequence of recent research [Candès and Recht, 2008, Candès and Tao, 2009], which has shown that perfect recovery from the observations is possible if the observed entries are uniformly sampled and $A^*$ has high incoherence, which may be defined as follows:

Definition 14 (Incoherence) An $n \times n$ matrix $X$ with singular value decomposition $X = U \Sigma V^T$ is $\mu$-incoherent if

$$\max_{i,j} |U_{ij}| \leq \frac{\sqrt{\mu}}{\sqrt{n}} \quad \text{and} \quad \max_{i,j} |V_{ij}| \leq \frac{\sqrt{\mu}}{\sqrt{n}}. \qquad (16)$$
Intuitively, higher incoherence (smaller $\mu$) means that large entries of the matrix are not concentrated in a small part. The following theorem shows that under high incoherence and uniform sampling, solving (15) exactly recovers $A^*$ with high probability.



Theorem 15 (Recovery Condition [Candès and Tao, 2009]) Let $A^*$ be an $n \times n$ matrix with rank $k$, with singular value decomposition $A^* = U \Sigma V^T$. In addition, assume $A^*$ is $\mu$-incoherent. Then there exists some constant $C$, such that if $C \mu^4 n k^2 \log^2 n$ entries are uniformly sampled, then with probability at least $1 - n^{-3}$, $A^*$ is the unique optimizer of (15).

In particular, if the underlying matrix has bounded rank (i.e. $k = O(1)$), the number of sampled entries required for recovery reduces to $O(\mu^4 n \log^2 n)$.

Based on Theorem 15, we now show that the notion of incoherence can be connected to the relative sizes of the clusters in signed networks. As a result, by solving (15), we will show that we can recover the underlying signed network with high probability if there are no extremely small groups. To start, we define the group imbalance of a signed network as follows:

Definition 16 (Group Imbalance) Let $A^*$ be the adjacency matrix of a complete $k$-weakly balanced network with $n$ nodes, and let $n_1, \ldots, n_k$ be the sizes of the groups. Group imbalance $\tau$ of $A^*$ is defined as

$$\tau := \max_{i=1,\ldots,k} \frac{n}{n_i}. \qquad (17)$$

By definition, $k \leq \tau \leq n$. Larger group imbalance $\tau$ indicates the presence of a very small group, which would intuitively make recovery of the underlying network harder (under uniform sampling). For example, consider an extreme scenario in which a $k$-weakly balanced network contains $n$ nodes, with two groups containing only one node each. Then the adjacency matrix of this network has group imbalance $\tau = n$ with the following form:

$$A^* = \begin{bmatrix} \ddots & & \vdots & -1 & -1 \\ & \ddots & \vdots & \vdots & \vdots \\ \cdots & \cdots & \ddots & -1 & -1 \\ -1 & \cdots & -1 & 1 & -1 \\ -1 & \cdots & -1 & -1 & 1 \end{bmatrix}.$$

However, without observing $A_{n-1,n}$ or $A_{n,n-1}$, it is impossible to determine whether the last two nodes are in the same cluster, or whether each of them belongs to an individual cluster. When $n$ is very large, the probability of observing one of these two entries will be extremely small. Therefore, under uniform sampling of $O(n \log^2 n)$ entries, it is unlikely that any matrix completion algorithm will be able to exactly recover this network.

Motivated by this example, we now analytically show that group imbalance $\tau$ determines the possibility of recovery. We first show the connection between $\tau$ and incoherence $\mu$.

Theorem 17 (Incoherence of Signed Networks) Let $A^*$ be the adjacency matrix of a complete $k$-weakly balanced network with group imbalance $\tau$. Then $A^*$ is $\tau$-incoherent.
Proof Recall from Definition 14 that $\mu$ is defined through the maximum absolute value in the (normalized) singular vectors of $A^*$, which are identical to its eigenvectors (up to signs), since $A^*$ is symmetric.

Let $u$ be any unit eigenvector of $A^*$ ($\|u\|_2 = 1$) with eigenvalue $\lambda \neq 0$. Suppose $i$ and $j$ are in the same group; then the $i$th and $j$th rows of $A^*$ are identical, i.e., $A^*_{i,:} = A^*_{j,:}$. As a result, the $i$th and $j$th elements of all such eigenvectors will be identical (since $u_i = A^*_{i,:} u / \lambda = A^*_{j,:} u / \lambda = u_j$). Thus, $u$ has the following form:

$$u = [\,\underbrace{\alpha_1, \alpha_1, \ldots, \alpha_1}_{n_1}, \underbrace{\alpha_2, \ldots, \alpha_2}_{n_2}, \ldots, \underbrace{\alpha_k, \ldots, \alpha_k}_{n_k}\,]^T. \qquad (18)$$

Because $\|u\|_2 = 1$, $\sum_{i=1}^{k} n_i \alpha_i^2 = 1$, so $n_i \alpha_i^2 \leq 1$, $\forall i$, which implies $|\alpha_i| \leq 1/\sqrt{n_i}$. Thus,

$$\max_i |u_i| = \max_i |\alpha_i| \leq \max_i \frac{1}{\sqrt{n_i}} = \max_i \frac{\sqrt{n/n_i}}{\sqrt{n}} \leq \frac{\sqrt{\tau}}{\sqrt{n}}.$$

Therefore, $A^*$ is $\tau$-incoherent. ■
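The bound in Theorem 17 can be checked numerically. The sketch below computes the group imbalance from group sizes and compares the largest entry of the eigenvectors with nonzero eigenvalue against $\sqrt{\tau}/\sqrt{n}$; the helper names, the tolerance, and the example group sizes are assumptions for illustration.

```python
import numpy as np

def group_imbalance(group_sizes):
    """tau = max_i n / n_i for the given group sizes."""
    n = sum(group_sizes)
    return max(n / n_i for n_i in group_sizes)

def max_singular_vector_entry(A_star, tol=1e-8):
    """Largest |entry| among unit eigenvectors with nonzero eigenvalue."""
    eigvals, eigvecs = np.linalg.eigh(A_star)
    keep = np.abs(eigvals) > tol
    return np.max(np.abs(eigvecs[:, keep]))

sizes = [2, 5, 13]                                   # hypothetical group sizes
labels = np.repeat(np.arange(len(sizes)), sizes)
A_star = np.where(labels[:, None] == labels[None, :], 1.0, -1.0)
n, tau = sum(sizes), group_imbalance(sizes)
bound = np.sqrt(tau) / np.sqrt(n)
largest = max_singular_vector_entry(A_star)
```

Here `largest` should not exceed `bound`; a very small group raises $\tau$ and loosens the bound, matching the intuition that such networks are harder to recover.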



Putting together Theorems 15 and 17, we now have the main theorem of this subsection:

Theorem 18 (Recovery Condition for Signed Networks) Suppose we observe edges $A_{ij}$, $(i,j) \in \Omega$, from an underlying $k$-weakly balanced signed network $A^*$ with $n$ nodes, and suppose that the following assumptions hold:

A. $k$ is bounded ($k = O(1)$),

B. the set of observed entries $\Omega$ is uniformly sampled, and

C. the number of samples is sufficiently large, i.e. $|\Omega| \geq C \tau^4 n \log^2 n$, where $\tau$ is the group imbalance of the underlying complete network $A^*$.

Then $A^*$ can be perfectly recovered by solving (15), with probability at least $1 - n^{-3}$.

In particular, if $n_i/n$ is lower bounded so that $\tau$ is a constant, then we only need $O(n \log^2 n)$ observed entries to exactly recover the complete $k$-weakly balanced network.



5.2 Sign Prediction via Singular Value Projection 



Though the convex optimization problem (15) mentioned in Section 5.1 can be solved to yield the global optimum, the computational cost of solving it might be prohibitive in practice. Therefore, recent research provides more efficient algorithms to approximately solve the matrix completion problem [Cai et al., 2010, Jain et al., 2010]. In particular, we consider the Singular Value Projection (SVP) algorithm proposed by Jain et al. [2010], which attempts to solve the low-rank matrix completion problem in an efficient manner. The SVP algorithm considers a robust formulation of (14) as follows:

$$\begin{aligned} \text{minimize} \quad & \|\mathcal{P}(X) - A\|_F^2 \\ \text{s.t.} \quad & \operatorname{rank}(X) \leq k, \end{aligned} \qquad (19)$$

where the projection operator $\mathcal{P}$ is defined as:

$$(\mathcal{P}(X))_{ij} = \begin{cases} X_{ij}, & \text{if } (i,j) \in \Omega, \\ 0, & \text{otherwise.} \end{cases}$$

Note that the objective (19) recognizes that there might be some violations of weak balance in the observations $A$, and minimizes the squared error instead of trying to enforce exact equality as in (15). In an attempt to optimize (19), the SVP algorithm iteratively calculates the gradient descent update $\hat{X}^{(t)}$ of the current solution $X^{(t)}$, and projects $\hat{X}^{(t)}$ onto the non-convex set of matrices whose rank is at most $k$ using SVD. After the optimal $X^*$ of (19) is derived, one can take the sign of each entry of $X^*$ to obtain an approximate solution of (14). The SVP procedure for sign prediction is summarized in Algorithm 1.



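Algorithm 1: Sign Prediction via Singular Value Projection (SVP)
Input: Adjacency matrix $A$, rank $k$, tolerance $\epsilon$, max iteration $t_{max}$, step size $\eta$
Output: $X^*$, the completed low-rank matrix that approximately solves (14)

1. Initialize $X^{(0)} \leftarrow 0$ and $t \leftarrow 0$.

2. Do

   • $\hat{X}^{(t)} \leftarrow X^{(t)} - \eta(\mathcal{P}(X^{(t)}) - A)$

   • $[U_k, \Sigma_k, V_k] \leftarrow$ Top $k$ singular vectors and singular values of $\hat{X}^{(t)}$

   • $X^{(t+1)} \leftarrow U_k \Sigma_k V_k^T$
   • $t \leftarrow t + 1$

   while $\|\mathcal{P}(X^{(t)}) - A\|_F^2 > \epsilon$ and $t < t_{max}$

3. $X^* \leftarrow \operatorname{sign}(X^{(t)})$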



In addition to its efficiency, experimental evidence provided by Jain et al. [2010] suggests that if observations are uniformly distributed, then all iterates of the SVP algorithm are $\mu$-incoherent, and if this occurs, then it can be shown that the matrix completion problem (14) can be exactly solved by SVP. In Section 6, we will see that SVP performs well in recovering weakly balanced networks.
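The steps of Algorithm 1 can be sketched with dense NumPy arrays as follows. This is an illustrative version under simplifying assumptions (a full SVD at each iteration, a boolean mask standing in for $\Omega$, and default parameter values chosen arbitrarily), not the scalable implementation used in experiments.

```python
import numpy as np

def svp_sign_prediction(A, mask, k, eta=1.0, eps=1e-6, t_max=200):
    """A: observed signs (ignored where mask is False); mask: True on Omega."""
    A_obs = np.where(mask, A, 0.0)
    X = np.zeros_like(A_obs, dtype=float)
    for _ in range(t_max):
        # Gradient step on ||P(X) - A||_F^2 (constant factor absorbed in eta).
        Y = X - eta * (np.where(mask, X, 0.0) - A_obs)
        # Projection onto rank-k matrices via truncated SVD.
        U, s, Vt = np.linalg.svd(Y, full_matrices=False)
        X = (U[:, :k] * s[:k]) @ Vt[:k, :]
        if np.linalg.norm(np.where(mask, X, 0.0) - A_obs) ** 2 <= eps:
            break
    return np.sign(X), X
```

For partially observed networks, the mask marks the sampled entries and the returned signs on unobserved entries are the predictions; the step size and rank are left to the caller.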



5.3 Sign Prediction via Matrix Factorization 

A classical limitation of both convex relaxation and SVP is that they require uniform sampling to ensure good performance. However, this assumption is violated in most real-life applications, and so these approaches do not work very well in practice. In addition, neither method scales to very large datasets. Thus, we use a gradient-based matrix factorization approach as an approximation to the signed network completion problem. In Section 6, we will see that this matrix factorization approach can boost the accuracy of estimation as well as scale to large real networks.

In the matrix factorization approach, we consider the following problem:

$$\min_{W, H \in \mathbb{R}^{n \times k}} \sum_{(i,j) \in \Omega} \big(A_{ij} - (WH^T)_{ij}\big)^2 + \lambda \|W\|_F^2 + \lambda \|H\|_F^2. \qquad (20)$$
Although problem (20) is non-convex, it is widely used in practical collaborative filtering applications, as its performance is competitive with or better than trace-norm minimization, while its scalability is much better. For example, to solve the Netflix problem, (20) has been applied with a fair amount of success to factorize datasets with 100 million ratings [Koren et al., 2009].

Nevertheless, there is an issue when modeling signed networks using (20): the squared loss in the first term of (20) tends to force entries of $WH^T$ to be either $+1$ or $-1$. However, what we care about in this completion task is the consistency between $\operatorname{sign}((WH^T)_{ij})$ and $\operatorname{sign}(A_{ij})$ rather than their difference. For example, $(WH^T)_{ij} = 10$ should have zero loss when $A_{ij} = +1$ if only the signs are important.

To resolve this issue, instead of using the squared loss, we use a loss function that only penalizes the inconsistency in sign. More precisely, objective (20) can be generalized as:

$$\min_{W, H \in \mathbb{R}^{n \times k}} \sum_{(i,j) \in \Omega} \operatorname{loss}\big(A_{ij}, (WH^T)_{ij}\big) + \lambda \|W\|_F^2 + \lambda \|H\|_F^2. \qquad (21)$$

In order to penalize inconsistency of sign, we can change the loss function to be the sigmoid or squared-hinge loss:

$$\begin{aligned} \operatorname{loss}_{\text{sigmoid}}(x, y) &= 1/(1 + \exp(xy)), \\ \operatorname{loss}_{\text{square-hinge}}(x, y) &= (\max(0, 1 - xy))^2. \end{aligned} \qquad (22)$$

In Section 6, we will see that applying sigmoid or squared-hinge loss functions slightly improves prediction accuracy.
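To make the difference between the losses concrete, the short sketch below evaluates the squared loss of (20) and the two sign-aware losses of (22) on the example from the text, an observed sign $x = +1$ against a predicted value $y = 10$. The function names are illustrative.

```python
import numpy as np

def squared_loss(x, y):
    return (x - y) ** 2

def sigmoid_loss(x, y):
    return 1.0 / (1.0 + np.exp(x * y))

def square_hinge_loss(x, y):
    return np.maximum(0.0, 1.0 - x * y) ** 2

x, y = 1.0, 10.0   # correct sign, large magnitude
print(squared_loss(x, y), sigmoid_loss(x, y), square_hinge_loss(x, y))
```

The squared loss heavily penalizes this prediction even though its sign is correct, whereas the squared-hinge loss is exactly zero and the sigmoid loss is close to zero.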

5.4 Time Complexity of Sign Prediction Methods 

There are two main optimization techniques for solving (21) for large-scale data: Alternating Least Squares (ALS) and Stochastic Gradient Descent (SGD) [Koren et al., 2009]. ALS solves the squared loss problem (20) by alternately minimizing over $W$ and $H$. When one of $W$ or $H$ is fixed, the optimization problem becomes a least squares problem with respect to the other variable, so that we can use well-developed least squares solvers to solve each subproblem. Given an $n \times n$ observed matrix with $m$ observations, it requires $O(mk^2)$ operations to form the Hessian matrices, and $O(nk^3)$ operations to solve each least squares subproblem. Therefore, the time complexity of ALS is $O(t_1(mk^2 + nk^3))$, where $t_1$ is the number of iterations.

However, ALS can only be used when the loss function is the squared loss. To solve the general form (21) with various loss functions, we use stochastic gradient descent (SGD). In SGD, for each iteration, we pick an observed entry $(i, j)$ at random, and only update the $i$th row $w_i^T$ of $W$ and the $j$th row $h_j^T$ of $H$. The update rule for $w_i^T$ and $h_j^T$ is given by:

$$\begin{aligned} w_i^T &\leftarrow w_i^T - \eta \left( \frac{\partial \operatorname{loss}(A_{ij}, (WH^T)_{ij})}{\partial w_i^T} + \lambda w_i^T \right), \\ h_j^T &\leftarrow h_j^T - \eta \left( \frac{\partial \operatorname{loss}(A_{ij}, (WH^T)_{ij})}{\partial h_j^T} + \lambda h_j^T \right), \end{aligned} \qquad (23)$$

where $\eta$ is a small step size. Each SGD update costs $O(k)$ time, and the total cost of sweeping through all the entries is $O(mk)$. Therefore, the time complexity for SGD is $O(t_2 mk)$, where $t_2$ is the number of iterations taken by SGD to converge. Notice that although the complexity of SGD is linear in $k$, it usually takes many more iterations to converge compared with ALS, i.e., $t_2 > t_1$.

On the other hand, all cycle-based algorithms introduced in Section 4 require time at least $O(nm)$, because they involve $n \times n$ sparse matrix multiplication steps in model construction. In particular, in the case of the most effective cycle-based method, HOC, for features with length $\ell$, the number of features is exponential in $\ell$ even if we reduce the number of features by ignoring the directions (see Section 4.3 for details). Therefore, the time complexity for HOC methods will be $O(2^\ell nm)$, which is much more expensive than both ALS and SGD as shown in Table 3 (note that in real large-scale social networks, $m > n \gg t_1, t_2, k$).

HOC                 LR-ALS                      LR-SGD
$O(2^\ell nm)$      $O(t_1(nk^3 + mk^2))$       $O(t_2 km)$

Table 3: Time complexity of cycle-based method (HOC) and low rank modeling methods (LR-ALS, LR-SGD). The HOC time only considers feature computation time. The time for low rank modeling consists of total model construction time.

Algorithm 2: Clustering with Matrix Completion
Input: Adjacency matrix $A$, number of clusters $k$
Output: Cluster indicators

1. $X^* \leftarrow$ Completion($A$) with any matrix completion algorithm.

2. $U \leftarrow$ Top $k$ eigenvectors of $X^*$.

3. Run any feature-based clustering algorithm on $U$.
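The SGD update (23) of Section 5.4 can be sketched as below for the two losses in (22). The gradients with respect to the prediction $(WH^T)_{ij}$ are derived by hand here, the regularization term follows the form $\lambda w_i$ written in (23), and all function names are assumptions made for illustration.

```python
import numpy as np

def sigmoid_loss_grad(x, y):
    """Derivative of 1 / (1 + exp(x*y)) with respect to y."""
    s = 1.0 / (1.0 + np.exp(x * y))
    return -x * s * (1.0 - s)

def square_hinge_loss_grad(x, y):
    """Derivative of max(0, 1 - x*y)^2 with respect to y."""
    return -2.0 * x * max(0.0, 1.0 - x * y)

def sgd_step(W, H, i, j, a_ij, loss_grad, eta, lam):
    """One update of (23) on observed entry (i, j); modifies W and H in place."""
    w_i, h_j = W[i].copy(), H[j].copy()
    g = loss_grad(a_ij, w_i @ h_j)          # d loss / d (W H^T)_ij
    W[i] = w_i - eta * (g * h_j + lam * w_i)
    H[j] = h_j - eta * (g * w_i + lam * h_j)

def sgd_epoch(W, H, entries, loss_grad, eta, lam, rng):
    """Sweep once over the observed entries [(i, j, a_ij), ...] in random order."""
    for idx in rng.permutation(len(entries)):
        i, j, a_ij = entries[idx]
        sgd_step(W, H, i, j, a_ij, loss_grad, eta, lam)
```

Each step touches only one row of $W$ and one row of $H$, which is why a single update costs $O(k)$ and a full sweep over $m$ observations costs $O(mk)$.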



5.5 Clustering Signed Networks 

In this section, we see how to take advantage of the low-rank structure of signed networks to find clusters. Based on weak balance theory, the general goal of clustering for signed graphs is to find a $k$-way partition such that most within-group edges are positive and most between-group edges are negative. One of the state-of-the-art algorithms for clustering signed networks, proposed by Kunegis et al. [2010], extends spectral clustering by using the signed Laplacian matrix. Given a partially observed signed network $A$, the signed Laplacian $L$ is defined as $D - A$, where $D$ is a diagonal matrix such that $D_{ii} = \sum_{j \neq i} |A_{ij}|$. By this definition, the clustering of signed networks can be derived by computing the top $k$ eigenvectors of $L$, say $U \in \mathbb{R}^{n \times k}$, and subsequently running the $k$-means algorithm on $U$ to get the clusters. This procedure is analogous to the standard spectral clustering algorithm on unsigned graphs; the only difference is that the usual graph Laplacian is replaced by the signed Laplacian.

However, there is no theoretical guarantee that the use of the signed Laplacian can recover the true groups in a weakly balanced signed network. To overcome this theoretical defect, we now give an algorithm which, under certain conditions, is able to recover the real structure even with partial observations. The key idea is as follows. Since, in Theorem 13, we proved that $k$-weakly balanced graphs have rank up to $k$, we can obtain good clustering by first running a matrix completion algorithm, say trace-norm minimization, on $A$. The following theorem shows that the eigenvectors of the completed matrix possess a desirable property.

Theorem 19 Let $A_{ij}$, $(i,j) \in \Omega$, be entries observed from a complete $k$-weakly balanced network $A^*$ with $n$ nodes, and assume that the solution of (15) is $X^*$ with eigenvectors $U = [u_1, u_2, \cdots, u_k]$. If the assumptions in Theorem 18 are all satisfied, then $U_{i,:} = U_{j,:}$ iff $i$ and $j$ are in the same cluster in $A^*$, with probability at least $1 - n^{-3}$.

Proof From Theorem 18 we know the recovered matrix $X^*$ will be $A^*$ with probability $\geq 1 - n^{-3}$ if the assumptions hold. Suppose $u_1, \ldots, u_k$ are the $k$ eigenvectors of $X^*$. From the proof of Theorem 17, the eigenvectors will have the form in (18), which means $U_{i,:} = U_{j,:}$ if $i$ and $j$ are in the same cluster. Furthermore, when $i$ and $j$ are in different clusters, $A^*_{i,:} \neq A^*_{j,:}$, so $U_{i,:}$ cannot equal $U_{j,:}$. This proves the theorem. ■

Following this theorem, the true clusters can be identified from the eigenvectors of $X^*$ when the assumptions in Theorem 18 hold. Therefore, perfect clustering is guaranteed in this scenario.

More generally, we can use any matrix completion method to complete $A$. For example, if we take SVP as the matrix completion approach, we can obtain a perfect clustering result if all iterates of the algorithm are $\mu$-incoherent. Under the latter condition, SVP can recover $A^*$ exactly, so the property of eigenvectors in Theorem 19 can again be used. Our clustering algorithm that uses matrix completion is summarized in Algorithm 2.

It should not be surprising that our clustering algorithm is superior to (signed) spectral clustering. In some sense, our approach can be viewed as a spectral method, except that it first fills in the missing links from the training data by doing matrix completion. This step is simple yet crucial in signed networks as it overcomes the sparsity of the network. We will see that our clustering algorithm outperforms the (signed) spectral clustering method in Section 6.
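The last two steps of the clustering procedure can be sketched as follows, assuming a completed symmetric matrix is already available. Two choices here are assumptions rather than part of the text: "top $k$" is taken to mean the $k$ eigenvalues of largest magnitude, and the feature-based clustering step simply groups nodes whose rows of $U$ coincide up to a tolerance, which is adequate only in the exact-recovery setting of Theorem 19 (in noisy settings one would substitute k-means or a similar method).

```python
import numpy as np

def cluster_from_completed(X_star, k, tol=1e-6):
    """Group nodes whose rows in the top-k eigenvector matrix U coincide."""
    eigvals, eigvecs = np.linalg.eigh(X_star)
    top = np.argsort(-np.abs(eigvals))[:k]   # assumption: largest |eigenvalue|
    U = eigvecs[:, top]
    labels = -np.ones(X_star.shape[0], dtype=int)
    reps = []                                # one representative node per cluster
    for i in range(U.shape[0]):
        for c, r in enumerate(reps):
            if np.linalg.norm(U[i] - U[r]) < tol:
                labels[i] = c
                break
        else:
            labels[i] = len(reps)
            reps.append(i)
    return labels
```

Cluster labels are assigned in order of first appearance, so they match the true grouping only up to a relabeling in general.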



6 Experimental Results 

We now present experimental results for sign prediction and clustering using our proposed 
methods. For sign prediction, we show that local methods, such as MOI and HOC, yield 
better predictive accuracy if longer cycles are considered. In addition, if we consider 
the global low-rank structure of the network, prediction via matrix factorization further 
outperforms local methods in terms of both accuracy and running time. For clustering, we 
show that clustering via low rank model gives us better results than clustering via signed 
Laplacian. These results suggest the usefulness of the global perspective of social balance. 
Table 4: Network Statistics

             # nodes    # edges    + edges    - edges
Wikipedia    7,065      103,561    78.8%      21.2%
Slashdot     82,144     549,202    77.4%      22.6%
Epinions     131,828    840,799    85.0%      15.0%



6.1 Description of Data Sets 

In our experiments, we consider both synthetic and real-life datasets. To construct synthetic networks, we first consider a complete $k$-weakly balanced network $A^*$, and sample some entries from $A^*$ to form the partially observed network $A$, with three controlling parameters: sparsity $s$, noise level $\epsilon$ and sampling process $\mathcal{D}$. The sparsity $s$ controls the percentage of edges we sample from $A^*$. The noise level $\epsilon$ specifies the probability that the sign of a sampled edge is flipped. The sampling process $\mathcal{D}$ specifies how the sampled entries are distributed. In particular, we will focus on two sampling distributions: uniform and power-law distribution, denoted as $\mathcal{D}_{uni}$ and $\mathcal{D}_{pow}$ respectively. Thus, a partially observed network $A$ can be described as $A = A^*(s, \epsilon, \mathcal{D})$.

We also consider three real-life signed networks: Epinions, Slashdot, and Wikipedia. Epinions is a consumer review network in which users can either trust or distrust other consumers' reviews. Slashdot is a discussion web site in which users can recognize others as friends or enemies. Wikipedia is a who-vote-to-whom network in which users can vote for or against others to be administrators in Wikipedia. These three datasets have previously been used as benchmarks for sign prediction [Leskovec et al., 2010a, Chiang et al., 2011]. Table 4 shows the statistics of these real signed networks.



6.2 Evidence of Local and Global Patterns in Real Signed Networks 

We have seen that cycles in signed networks exhibit structural balance (see Definition 1) according to balance theory, and that we can make use of cycles for predictions (see Sections 3 and 4). Indeed, cycles tend to be balanced in real-life networks. In all three real networks we consider, Leskovec et al. [2010b] found that balanced triads (i.e. 3-cycles) are much more likely to be observed than unbalanced triads. Our study supports that the local patterns (i.e. $\ell$-cycles) of the three networks tend to be balanced. For each network $A$, we consider all patterns of 3-cycles and 4-cycles in the symmetric network $\operatorname{sign}(A + A^T)$. For convenience, we use $C_{\ell i}$ to denote the $i$th pattern of an $\ell$-cycle. Thus, we simply consider all $C_{\ell i}$ for $\ell = 3, 4$ for each network. The patterns of these cycles are shown in Table 5. We first calculate the probability that the configuration of a given $\ell$-cycle is $C_{\ell i}$, denoted $P(C_{\ell i})$. We then randomly shuffle the sign of edges in the network and calculate the same probability on the shuffled network, which is denoted $P_0(C_{\ell i})$. Thus $P_0(C_{\ell i})$ can be viewed as the (expected) probability that $C_{\ell i}$ is observed if the sign of edges has no particular pattern. With the two probabilities, we calculate the "surprise" of $C_{\ell i}$ that measures how significantly $C_{\ell i}$ appears more or less than expected. Formally, the surprise of $C_{\ell i}$, denoted $S(C_{\ell i})$, is defined as:

$$S(C_{\ell i}) = \frac{\Delta_\ell P(C_{\ell i}) - \Delta_\ell P_0(C_{\ell i})}{\sqrt{\Delta_\ell P_0(C_{\ell i})(1 - P_0(C_{\ell i}))}},$$

where $\Delta_\ell$ is the number of $\ell$-cycles in the network. The above quantity is basically the number of standard deviations by which the observed value of $C_{\ell i}$ differs from the expected value of $C_{\ell i}$ in the shuffled network. See Leskovec et al. [2010b] for more discussion.

Table 5: Statistics of balanced and unbalanced $\ell$-cycles, $\ell = 3, 4$ (notice that $\sum_i P(C_{\ell i}) = \sum_i P_0(C_{\ell i}) = 1$ due to the property of probability). The first 6 cycles are balanced and the last 4 cycles are unbalanced. The last two rows show that balanced 3-cycles and 4-cycles are much more frequent than expected.
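The surprise statistic is the gap between observed and expected pattern counts, measured in binomial standard deviations of the shuffled network. A small sketch, with hypothetical argument names and made-up numbers in the usage line, is:

```python
import math

def surprise(num_cycles, p_real, p_shuffled):
    """(observed count - expected count) / binomial standard deviation."""
    observed = num_cycles * p_real
    expected = num_cycles * p_shuffled
    std = math.sqrt(num_cycles * p_shuffled * (1.0 - p_shuffled))
    return (observed - expected) / std

# Illustrative numbers only: 100 cycles, pattern seen 60% of the time
# versus 50% in the shuffled network.
s_example = surprise(100, 0.6, 0.5)
```

Because the denominator grows only like the square root of the number of cycles, even a small difference between the two probabilities produces a large surprise value when the number of cycles is very large.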



Table 5 shows the real probability, the expected probability, and the surprise value
of each $C_{\ell i}$ in three networks. From the surprise values, we observe that cycles with all
positive edges (i.e. $C_{31}$, $C_{41}$) are far more frequent than expected, and cycles with one negative
edge (i.e. $C_{32}$, $C_{42}$) are far less frequent than expected. The observations suggest that the number
of both balanced/unbalanced patterns are significantly larger/smaller than expected when
the cycle contains many positive edges. Readers might notice that some balanced cycles
have large negative surprise values (for example, $C_{44}$ in Epinions). However, in both real
and shuffled networks, the fractions of such cycles are actually quite similar. The negative
value of surprise is amplified by large number of observations of ^-cycles (for example, 
A4 = 6.85 X 10^). Furthermore, we also calculate these statistics on all balanced 3 and 
4-cycles as shown in the last two rows in Table 5. Both the difference between $P(C)$ and $P_0(C)$ and the surprise value of balanced cycles are quite large. Overall, we find that local balanced patterns are somewhat significant.

On the other hand, in Section 5, we have seen that low rank structure emerges when we theoretically examine weakly balanced networks. We now show that real networks tend to exhibit low-rank structure to a much greater extent compared to random networks. As a baseline, for each real network we create two corresponding random networks for comparison: the first one is the (symmetric) ER network generated from the Erdős-Rényi model [Erdős and Rényi, 1960] that preserves the sparsity and the ratio of positive to negative edges of the compared real network. The second one is the shuffled network with the same network structure as the compared real network, except that we randomly shuffle the sign of edges.

Figure 2: Relative error on $\Omega$, the observed entries, between adjacency matrix and completed matrix, for real-life networks versus random networks. Real-life networks achieve much smaller relative error for every $k$ as compared with random networks.

The experiment is conducted as follows. We first derive the low-rank complete matrix $X^*$ by running a matrix completion algorithm on the observed entries $A_{ij}$. Then, we look at the relative error on the observed set $\Omega$:

$$\operatorname{err}_\Omega = \frac{\|W \circ (X^* - A)\|_F}{\|A\|_F}, \qquad (24)$$

where $W_{ij} = 1$ if $(i, j) \in \Omega$ and $W_{ij} = 0$ otherwise, and $\circ$ denotes element-wise multiplication. Clearly, smaller $\operatorname{err}_\Omega$ indicates better approximation for the observed entries.
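A direct way to compute the relative error on the observed entries is sketched below. It assumes plain (unsquared) Frobenius norms and a boolean mask marking $\Omega$; if the norms are squared, the values change monotonically but the comparison between networks is unaffected. The function name is hypothetical.

```python
import numpy as np

def relative_error_on_observed(X_star, A, mask):
    """||W o (X* - A)||_F / ||A||_F with W the 0/1 indicator of observed entries."""
    W = mask.astype(float)
    return np.linalg.norm(W * (X_star - A)) / np.linalg.norm(A)
```

Entries outside the mask do not contribute to the numerator, so the completed matrix is judged only on how well it reproduces the observed signs.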

In our experiment, we choose the matrix factorization approach for matrix completion,
with ranks $k = 1, 2, 4, 8, 16$ and $32$. For each network (the real networks and their
corresponding random networks), we complete the network with each value of $k$ and
compute $\mathrm{err}_\Omega$. The result is shown in Figure 2. Compared to the two random networks,
the three real-life networks achieve much smaller $\mathrm{err}_\Omega$ for each small $k$. This suggests
that low-rank matrices approximate the observed entries of real-life networks better than
those of random networks.
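The comparison above can be illustrated with a small sketch. It is not the paper's experiment: it assumes a fully observed network (so the mask is all ones) and uses a truncated SVD as a stand-in for matrix completion. The helper names `relative_error` and `rank_k_approx`, the cluster sizes and the random seed are all illustrative choices.

```python
import numpy as np

def relative_error(A, X, W):
    """||W o (X - A)||_F / ||A||_F, where W is the 0/1 mask of observed entries."""
    return np.linalg.norm(W * (X - A)) / np.linalg.norm(A)

def rank_k_approx(A, k):
    """Best rank-k approximation by truncated SVD (stand-in for matrix completion)."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return (U[:, :k] * s[:k]) @ Vt[:k, :]

rng = np.random.default_rng(0)
n, k = 60, 3
labels = np.repeat(np.arange(k), n // k)          # 3 clusters of 20 nodes

# Complete weakly balanced network: +1 inside a cluster, -1 across clusters.
A_bal = np.where(labels[:, None] == labels[None, :], 1.0, -1.0)

# Shuffled baseline: same multiset of signs, randomly reassigned to node pairs.
iu = np.triu_indices(n, k=1)
A_shuf = np.zeros((n, n))
A_shuf[iu] = rng.permutation(A_bal[iu])
A_shuf = A_shuf + A_shuf.T
np.fill_diagonal(A_shuf, 1.0)

W = np.ones((n, n))                               # assumption: everything observed
err_bal = relative_error(A_bal, rank_k_approx(A_bal, k), W)
err_shuf = relative_error(A_shuf, rank_k_approx(A_shuf, k), W)
print(f"balanced: {err_bal:.2e}, shuffled: {err_shuf:.2f}")
```

In this toy setting the balanced matrix has rank at most 3, so its rank-3 error is numerically zero, while the shuffled matrix retains a large error at the same rank.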



6.3 Sign Prediction 

We now compare the performance of our proposed methods for sign prediction. As
introduced in Sections 3 and 4, there are two families of cycle-based methods: one based
on measures of imbalance (MOI), and the other based on supervised learning using
higher order cycles (HOC). Both families depend on a parameter $\ell \geq 3$ that denotes the
order of the cycles that the method is based on. For MOI, we consider all $\ell$ less than 10 as
well as $\infty$ (recall that in this case MOI becomes the Katz measure), and for HOC we consider
$\ell = 3, 4, 5$. Note that the set of features used by HOC-$(\ell + 1)$ is a strict superset of the
features used by HOC-$\ell$.

We also consider two fully global approaches for low rank matrix completion: Singular
Value Projection from Section 5.2 and matrix factorization from Section 5.3. The
SVP approach (denoted as LR-SVP) is chosen to demonstrate that perfect recovery can
be achieved if the observations are uniformly distributed. For matrix factorization, we
consider the ALS method that solves problem (20), as well as SGD methods that solve
the general problem (21) with sigmoid loss and square-hinge loss, defined in (22). We
denote these methods as LR-ALS, LR-SIG and LR-SH, respectively.
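As a rough illustration of the ALS idea, the sketch below alternates ridge-regression solves for the rows of two factor matrices. It assumes the usual regularized squared-loss objective over observed entries, $\sum_{(i,j) \in \Omega} (A_{ij} - w_i^T h_j)^2 + \lambda(\|W\|_F^2 + \|H\|_F^2)$, which may differ in detail from problem (20). The function name `als_complete`, the regularization weight, the network size and the sampling rate are assumptions made for the example.

```python
import numpy as np

def als_complete(A, mask, k, lam=0.1, iters=20, seed=0):
    """Alternating least squares on the observed entries (mask == 1) of A.

    Each row of W (then of H) is the solution of a small ridge regression
    with the other factor held fixed. Returns the completed matrix W H^T.
    """
    n, m = A.shape
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((n, k))
    H = rng.standard_normal((m, k))
    reg = lam * np.eye(k)
    for _ in range(iters):
        for i in range(n):
            obs = mask[i] > 0
            Hi = H[obs]
            W[i] = np.linalg.solve(Hi.T @ Hi + reg, Hi.T @ A[i, obs])
        for j in range(m):
            obs = mask[:, j] > 0
            Wj = W[obs]
            H[j] = np.linalg.solve(Wj.T @ Wj + reg, Wj.T @ A[obs, j])
    return W @ H.T

# Toy check on a complete 3-weakly balanced network with 30% of entries observed.
rng = np.random.default_rng(1)
labels = np.repeat(np.arange(3), 50)
A_star = np.where(labels[:, None] == labels[None, :], 1.0, -1.0)
mask = (rng.random(A_star.shape) < 0.3).astype(float)

X = als_complete(A_star * mask, mask, k=3)
pred = np.where(X >= 0, 1.0, -1.0)
held_out = mask == 0
accuracy = np.mean(pred[held_out] == A_star[held_out])
print(f"sign prediction accuracy on unobserved entries: {accuracy:.3f}")
```

Signs are predicted by thresholding the completed matrix at zero; the sigmoid and square-hinge variants would replace the squared loss and are not shown here.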



6.3.1 Synthetic Datasets 

We first compare all categories of approaches on synthetic datasets. We choose LR-SVP,
LR-ALS, MOI-$\infty$ and HOC-3 as representatives of the two approaches to low rank
matrix completion, MOI-based methods, and HOC-based methods, respectively. We consider the
underlying network $A^*$ to be a complete 5-weakly balanced network, where the five clusters
have sizes 100, 200, 300, 400 and 500. Instead of observing all of $A^*$, we assume that we
only observe a partial network $A$ obtained by sampling some entries from $A^*$ using three
sampling procedures: uniform sampling, uniform sampling with noise, and sampling with a
power-law distribution. For each algorithm, we input the observed entries as training data and
calculate the sign prediction accuracy on the rest of the entries.

Uniform sampling: In this scenario, we generate several observed networks
$A = A^*(s, 0, P_{\mathrm{uni}})$. We vary $s$ from 0.001 to 0.1 and plot the prediction accuracy in Figure 3a.
Under this setting, LR-SVP and LR-ALS outperform the cycle-based methods. We observe
that MOI-$\infty$ performs the worst, with accuracy of only 50%-70%. However, if we repeat the
same experiment substituting $A^*$ with $A_2$, where $A_2$ is a complete strongly balanced
network whose two groups have size 1000, we observe that MOI and the global methods
perform alike, as shown in Figure 3b. This is because MOI uses cycle-based measurements
to make more cycles balanced. This prediction policy is most appropriate when
$k = 2$ (that is, the underlying network $A^*$ has strong balance), but performs poorly when
the underlying network is weakly balanced (i.e., has more than two groups). HOC-3 works
much better than MOI-$\infty$ since it learns a classifier from cycle-based features rather than
simply making cycles balanced, but its accuracy drops dramatically when $s$ is less than
0.05. On the other hand, both LR-SVP and LR-ALS show high accuracy for all $s > 0.01$.
In particular, LR-SVP achieves 100% accuracy when $s > 0.07$, which reconfirms the
theoretical recovery guarantee for SVP discussed in Section 5.2. Moreover, although LR-ALS has
no theoretical guarantee, it can still recover the ground truth, an observation that is
consistent with previous results.

Uniform sampling with noise: To make the synthetic data more similar to real
data, we further add noise to the observations. We generate observed networks
$A = A^*(0.1, \epsilon, P_{\mathrm{uni}})$, where $\epsilon$ varies from 0.01 to 0.25. The result is shown in Figure 3c. We can
see that global methods are still clearly better than cycle-based methods when the noise level
becomes higher. Moreover, LR-SVP perfectly recovers $A^*$ when the noise level $\epsilon < 0.05$,
and LR-ALS also achieves perfect recovery for smaller $\epsilon$.
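The uniform sampling procedures can be sketched as follows. The sketch assumes that in $A^*(s, \epsilon, P_{\mathrm{uni}})$ each node pair is observed independently with probability $s$ and each observed sign is flipped independently with probability $\epsilon$; the precise definition is given elsewhere in the paper and may differ. The cluster sizes are scaled down by a factor of ten, and the function names are illustrative.

```python
import numpy as np

def weakly_balanced(sizes):
    """Complete weakly balanced network: +1 within a cluster, -1 across clusters."""
    labels = np.repeat(np.arange(len(sizes)), sizes)
    A_star = np.where(labels[:, None] == labels[None, :], 1, -1)
    return A_star, labels

def sample_uniform(A_star, s, eps, rng):
    """Observe each node pair with probability s; flip each observed sign with probability eps.

    Unobserved pairs (and the diagonal) are encoded as 0. The result is symmetric.
    """
    n = A_star.shape[0]
    iu = np.triu_indices(n, k=1)
    observed = rng.random(iu[0].size) < s
    flipped = rng.random(iu[0].size) < eps
    vals = A_star[iu] * np.where(flipped, -1, 1) * observed
    A = np.zeros((n, n), dtype=int)
    A[iu] = vals
    return A + A.T

rng = np.random.default_rng(0)
A_star, labels = weakly_balanced([10, 20, 30, 40, 50])   # scaled-down cluster sizes
A_clean = sample_uniform(A_star, s=0.1, eps=0.0, rng=rng)
A_noisy = sample_uniform(A_star, s=0.1, eps=0.2, rng=rng)
print("observed fraction:", np.count_nonzero(A_clean) / A_clean.size)
```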

Sampling with power-law distribution: As stated in Sections 5.1 and 5.2, the exact
recovery guarantees of convex relaxation and SVP for matrix completion crucially rely
on the assumption that observed entries are uniformly sampled. However, in most real
networks (for example, Slashdot in Kunegis et al. [2009]), the degree distribution of
observed entries follows a power law. Therefore, we examine how the approaches perform
on power-law distributed networks. The power-law distributed networks are generated
using the Chung-Lu-Vu (CLV) model proposed by Chung et al. [2003], which allows one
to generate random graphs with an arbitrary expected degree sequence. As in the
uniform sampling case, we perform the sign prediction task on $A = A^*(s, 0, P_{\mathrm{pow}})$, varying $s$
from 0.001 to 0.1, and plot the prediction accuracy in Figure 3d. We can see that MOI-$\infty$
still performs poorly on weakly balanced graphs. However, unlike in the uniform
sampling case, LR-SVP has lower accuracy than HOC-3 when $s < 0.1$. On the
other hand, LR-ALS still performs better than all other methods on power-law distributed
graphs.
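A minimal sketch of sampling observed pairs with a prescribed expected degree sequence is given below. It uses the standard expected-degree construction, in which pair $(i, j)$ is observed with probability $\min(1, w_i w_j / \sum_k w_k)$. The weight sequence $w_i \propto i^{-1/2}$, the scale factor and the function name `chung_lu_mask` are illustrative assumptions, not the settings used in the experiments.

```python
import numpy as np

def chung_lu_mask(weights, rng):
    """Symmetric 0/1 mask where pair (i, j) is observed with probability
    min(1, w_i * w_j / sum(w)), so node i has expected degree close to w_i."""
    w = np.asarray(weights, dtype=float)
    n = len(w)
    P = np.minimum(1.0, np.outer(w, w) / w.sum())
    iu = np.triu_indices(n, k=1)
    M = np.zeros((n, n), dtype=int)
    M[iu] = rng.random(iu[0].size) < P[iu]
    return M + M.T

rng = np.random.default_rng(0)
n = 500
# Assumed heavy-tailed expected degrees: w_i proportional to i^(-1/2).
w = 40.0 * np.arange(1, n + 1) ** -0.5
mask = chung_lu_mask(w, rng)
deg = mask.sum(axis=1)
print("mean degree of first 10 nodes:", deg[:10].mean())
print("mean degree of last 10 nodes:", deg[-10:].mean())
```

Multiplying such a mask element-wise with a complete signed network $A^*$ yields a partially observed network whose observation pattern is skewed toward a few high-degree nodes.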




[Figure 3 panels: (a) Uniformly sampled without noise (k = 5); (b) Uniformly sampled without noise on balanced networks (k = 2); (c) Uniformly sampled with noise (k = 5); (d) Power-law distributed networks (k = 5). Horizontal axes: fraction of observed entries in (a), (b) and (d); fraction of noisy entries in (c). Legend: LR-SVP, LR-ALS, MOI-$\infty$.]

Figure 3: Sign prediction accuracy of local and global methods on synthetic datasets. On
(strongly) balanced networks (Figure 3b), MOI-$\infty$ is seen to perform as well as LR-SVP and
LR-ALS. However, in general weakly balanced networks, the global methods LR-SVP and
LR-ALS outperform cycle-based methods such as MOI-$\infty$ and HOC-3. In addition, LR-ALS
is more robust than LR-SVP when the observations are sampled from a power-law
distribution.



From the results on synthetic data shown in Figure 3, we can conclude that global methods
generally do better than local methods in the synthetic setting, and that the low rank model with
matrix factorization (LR-ALS) performs the best in most cases, even when observed entries
are not uniformly distributed.

Figure 4: Accuracy of Measures of Imbalance (MOI) Based Methods for $\ell = 3, 4, 5, 10$.
These plots show the accuracy of MOI-$\ell$ methods for edges with embeddedness at least
$T$ for various thresholds $T$. We see that the difference in the performance of MOI-3 and
higher order methods is larger when edges with lower embeddedness are considered. We
also see that the improvement obtained by going beyond order 5 is not very significant.

6.3.2 Real-life Datasets 

Now we further evaluate our sign prediction methods on three real-life networks. To begin
with, we evaluate and compare MOI methods using a leave-one-out type methodology:
each edge in the network is successively removed and the method tries to predict the sign
of that edge using the rest of the network. Figure 4 shows the accuracy of MOI-based
methods. Note that the accuracy is shown for edges with embeddedness under a certain
threshold. First, we see that the accuracy is a non-decreasing function of the embeddedness
threshold. Next, it is clear that higher-order methods perform significantly better than
the MOI-3 (triangles) method. Finally, the performance boost is larger for edges with low
embeddedness. This is expected, as edges of low embeddedness by definition do not have
many common neighbors for their end-points, and higher-order cycles carry relatively more
information for such edges than for others. We also observe from our experiments that beyond
$\ell = 5$, the performance gain is not very significant.
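To make the role of embeddedness concrete, the sketch below computes the number of common neighbors of each node pair and evaluates a simple triangle-based rule that predicts the sign of $(A^2)_{ij}$. This rule is an assumed stand-in for a triangle-level predictor and is not a transcription of the MOI-3 definition from Section 3. With a zero diagonal, $(A^2)_{ij}$ does not involve $A_{ij}$ itself, so the score already behaves as a leave-one-out prediction. Function names and the toy network are illustrative.

```python
import numpy as np

def embeddedness(A):
    """Number of common neighbors of every node pair (off-diagonal entries)."""
    B = (A != 0).astype(int)
    return B @ B

def triangle_scores(A):
    """(A^2)_ij: sum over common neighbors k of A_ik * A_kj (requires zero diagonal)."""
    return A @ A

def loo_accuracy(A, min_embed=0):
    """Accuracy of predicting sign(A_ij) by sign((A^2)_ij) on edges whose
    embeddedness is at least min_embed."""
    E = embeddedness(A)
    S = triangle_scores(A)
    iu = np.triu_indices(A.shape[0], k=1)
    keep = (A[iu] != 0) & (E[iu] >= min_embed)
    pred = np.where(S[iu][keep] >= 0, 1, -1)
    return np.mean(pred == A[iu][keep])

# Toy example: a complete strongly balanced network with two groups of 15 nodes.
labels = np.repeat([0, 1], 15)
A = np.where(labels[:, None] == labels[None, :], 1, -1)
np.fill_diagonal(A, 0)
acc = loo_accuracy(A, min_embed=1)
print("triangle-rule accuracy:", acc)
```

On a strongly balanced network every two-step path agrees with the sign of the edge it closes, so the triangle rule is exact in this toy case.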

Next, we compare HOC methods. We resort to 10-fold cross-validation. To be more
concrete, we (randomly) create 10 disjoint test folds, each consisting of 10% of the total
number of edges in the network. For each test fold, the remaining 90% of the edges serve
as the training set. For a given test fold, feature extraction and logistic model training
happen on a graph with the test edges removed. To evaluate HOC methods, we consider
not only prediction accuracies but also false positive rates. We report both accuracies and
false positive rates by averaging them over the 10 folds. As shown in Table 6, in all the
datasets there is a small improvement in accuracy from using higher order cycles (HOC-5).
The false positive rate, shown in Figure 5, reveals a more interesting phenomenon.
Indeed, higher order methods (such as HOC-5) significantly reduce the false positive rate
compared to HOC-3. However, Figure 5 shows that, unlike for MOI-based methods,
edge embeddedness does not seem to affect the decrease in false positive rate for HOC
methods. We see this trend across all the datasets.

Figure 5: False Positive Rates of Higher Order Cycle (HOC) Methods for $\ell = 3, 5$. These
plots show the false positive rate of HOC-$\ell$ methods for edges with embeddedness at least
$T$ for various thresholds $T$. We see that considering higher order cycles has the benefit of
significantly reducing false positives while simultaneously achieving slightly better overall
accuracy (refer to Table 6). However, unlike what we see for MOI methods, here the
improvement does not seem to depend strongly on edge embeddedness. The false positive
rates for HOC-4 are very similar to those of HOC-5 and hence are not shown for clarity.

Table 6: The sign prediction accuracy for low rank modeling methods and cycle-based
methods. We can see that the low rank modeling approaches are better than cycle-based
methods.
0.5539 0.8497
0.3697 0.7850
0.7456 0.8220
0.8424 0.8605
0.8774 0.8789 0.8835

At this point, we see that for cycle-based methods, considering higher order cycles 
benefits the accuracy of sign prediction and lowers the false positive rate. Furthermore, the 
results are consistent across the three diverse networks. These results confirm the intuition 
that getting more global information improves quality of prediction, and motivate us to 
consider the fully global structure of networks. 
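The evaluation protocol for the cycle-based classifiers (disjoint test folds, accuracy, and false positive rate) can be sketched as follows. The sketch assumes that the false positive rate is the fraction of truly negative edges that are predicted positive; the helper names and the toy labels are illustrative.

```python
import numpy as np

def kfold_edge_splits(num_edges, k, rng):
    """Randomly partition edge indices 0..num_edges-1 into k disjoint test folds."""
    return np.array_split(rng.permutation(num_edges), k)

def accuracy_and_fpr(y_true, y_pred):
    """Accuracy, and false positive rate = P(predicted +1 | true sign is -1)."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    acc = np.mean(y_true == y_pred)
    neg = y_true == -1
    fpr = np.mean(y_pred[neg] == 1) if neg.any() else 0.0
    return acc, fpr

rng = np.random.default_rng(0)
folds = kfold_edge_splits(100, 10, rng)
for test_idx in folds:
    # Training edges are all edges outside the current test fold.
    train_idx = np.setdiff1d(np.arange(100), test_idx)
    assert len(train_idx) == 90

acc, fpr = accuracy_and_fpr([1, 1, 1, -1, -1], [1, 1, -1, 1, -1])
print(acc, fpr)
```

In a full pipeline, features would be extracted from the graph with the test-fold edges removed, and the two metrics would be averaged over the folds.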

Now we turn our attention to low rank modeling approaches. We have seen that
LR-SVP fails to perform well under power-law distributions of observed relationships in
synthetic networks (see Figure 3d), so we consider the more robust matrix factorization
approach for solving the matrix completion problem, including LR-ALS, LR-SIG and LR-SH,
for experiments on real datasets. Again, we use the 10-fold cross-validation setting, and
report the average prediction accuracy for each dataset in Table 6. From the table, we
observe that global methods clearly outperform cycle-based methods. In particular, we
observe that HOC-5 improves on HOC-3 by less than 1.5%, while global methods
consistently improve on the accuracy of HOC-5 by more than 2% over all datasets. In addition,
LR-SIG and LR-SH further improve on the accuracy of LR-ALS. This shows that the sigmoid
and square-hinge losses are more suitable for sign prediction, which supports the discussion
in Section 5.3.

In Figure 6, we further select a representative of each category, MOI-10, HOC-5 and
LR-ALS, and show their performance at different levels of edge embeddedness (LR-SIG
and LR-SH perform similarly to LR-ALS on all datasets). One might expect that cycle-based
approaches should perform better on edges with higher embeddedness because more
cycle information is available. Surprisingly, however, LR-ALS achieves higher prediction
accuracy regardless of the embeddedness. All of the above results show that global methods are
more effective than local methods.

Figure 6: Sign prediction accuracy of local and global methods with different levels of
embeddedness. These plots show the accuracy for edges with embeddedness at least $T$.
We can see that LR-ALS consistently achieves the highest accuracy for all thresholds $T$.



6.3.3 Running Time Comparison 

In addition to prediction accuracy, we now compare the running time required by the
different methods. As discussed in Section 5.3, low rank modeling with matrix factorization
is more efficient than cycle-based algorithms in terms of time complexity. Here, we
further show that matrix factorization methods are empirically much faster than cycle-based
algorithms. The running times are summarized in Table 7. To conduct timing tests on
a large signed network, in addition to the three real datasets described earlier, we
construct a large-scale synthetic dataset called Cluster10, in which the number of edges is
100 times more than in Epinions. Cluster10 is generated from a 10-weakly balanced
network, in which the clusters have sizes 20000, 40000, . . . , 200000. In total, there are 1.1 million
nodes and 120 million edges uniformly sampled from the complete graph. We construct
this synthetic data to show that our matrix factorization approach can easily scale up to
massive graphs compared to HOC methods. For the matrix factorization approach, we report
the time needed to solve the model by SGD (with sigmoid and square-hinge losses) and ALS
(with square loss). For HOC methods, which build classifiers from cycle-based features,
the time for the training phase depends on the classifier, so we only report the time for
computation of features. Thus the reported time for HOC is an underestimate of the time
required to construct the HOC model; even then, we can see that the time required by
LR-ALS, LR-SIG and LR-SH is much lower than that of HOC methods.

In conclusion, for the sign prediction problem, we see that considering the fully global
structure of networks gives us the best results. In particular, the low rank model with
matrix factorization is clearly the most competitive method in terms of accuracy and
scalability.

Table 7: Running time (in seconds) for the low rank model with matrix factorization and HOC
on real datasets and a 1.1 million node synthetic dataset, Cluster10. The time of LR-SGD is
the average time of LR-SIG and LR-SH. For HOC methods, we only consider the time for
feature computation before the model training, while for LR methods we report the total
time for constructing the model. We can see that LR methods with matrix factorization
are clearly more efficient than cycle-based algorithms.



6.4 Clustering 

In this subsection, we show that our clustering approach, which completes the low-rank
structure of signed networks before performing clustering, outperforms spectral clustering
based on the signed Laplacian [Kunegis et al., 2010]. We conduct experiments on synthetic
data generated from weakly balanced networks (note that we do not have ground truth for
clustering in the real-life datasets). We consider a 10-weakly balanced network $A^*$ where
the size of each group is 100, and observe entries from $A^*$ with two sampling procedures:
uniform sampling and uniform sampling with noise.

To measure the performance of clustering, we calculate the number of edges that satisfy
the ground-truth clustering, which is defined by

\[ \sum_{(i,j):\, s_i = s_j} I(\bar{s}_i = \bar{s}_j) \; + \sum_{(i,j):\, s_i \neq s_j} I(\bar{s}_i \neq \bar{s}_j), \tag{25} \]

where $s_1, \ldots, s_n$ denote the ground-truth clustering assignment for each node, and
$\bar{s}_1, \ldots, \bar{s}_n$ are the clustering results given by the clustering algorithm.
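One way to compute such an agreement count is sketched below. It assumes the measure counts unordered node pairs on which the predicted clustering and the ground truth agree: both place the pair in the same cluster, or both separate it. The function name `pair_agreement` and the toy label vectors are illustrative.

```python
import numpy as np

def pair_agreement(truth, pred):
    """Number of unordered node pairs (i, j), i < j, on which two clusterings agree:
    same cluster in both, or different clusters in both."""
    truth = np.asarray(truth)
    pred = np.asarray(pred)
    same_true = truth[:, None] == truth[None, :]
    same_pred = pred[:, None] == pred[None, :]
    iu = np.triu_indices(len(truth), k=1)
    return int(np.sum(same_true[iu] == same_pred[iu]))

truth = [0, 0, 1, 1]
perfect = pair_agreement(truth, [1, 1, 0, 0])   # same partition, labels renamed
partial = pair_agreement(truth, [0, 1, 1, 1])   # node 1 moved to the other cluster
print(perfect, partial)
```

Because only co-membership is compared, the count is invariant to how the clusters are named.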

Following the procedure outlined in the previous subsection, in the uniform sampling
case we consider the networks $A = A^*(s, 0, P_{\mathrm{uni}})$ with $s \in [0.01, 0.06]$, while in the sampling
with noise case we consider networks $A = A^*(0.1, \epsilon, P_{\mathrm{uni}})$ with $\epsilon \in [0.01, 0.06]$. For each
observed network, we apply Algorithm 2 (see Section 5.5) and clustering via the signed
Laplacian, and evaluate the clustering results using (25). The results of these two scenarios are
shown in Figure 7. In both scenarios, our proposed clustering approach is significantly
better than clustering based on the signed Laplacian. This shows that recovering the
low-rank structure of signed networks leads to improved clustering results.
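For reference, the signed Laplacian baseline can be sketched in its simplest two-group form. The signed Laplacian is $\bar{L} = \bar{D} - A$ with $\bar{D}_{ii} = \sum_j |A_{ij}|$. The example below is only a two-cluster illustration on a fully observed, strongly balanced network, where the eigenvector of the smallest eigenvalue splits the nodes by sign; the experiments above involve ten clusters and partial observations, which would require more eigenvectors and a k-means step not shown here. The group sizes are arbitrary.

```python
import numpy as np

def signed_laplacian(A):
    """Signed Laplacian D_bar - A, where D_bar_ii = sum_j |A_ij|."""
    return np.diag(np.abs(A).sum(axis=1)) - A

# Toy strongly balanced network: groups of 6 and 4 nodes, fully observed.
labels = np.repeat([0, 1], [6, 4])
A = np.where(labels[:, None] == labels[None, :], 1.0, -1.0)
np.fill_diagonal(A, 0.0)

L = signed_laplacian(A)
vals, vecs = np.linalg.eigh(L)          # eigenvalues in ascending order
v = vecs[:, 0]                          # eigenvector of the smallest eigenvalue
groups = (v > 0).astype(int)            # split nodes by the sign of their entry
print("smallest eigenvalue:", round(float(vals[0]), 10))
print("recovered groups:", groups)
```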



[Figure 7 panels: (a) Data without noise, horizontal axis: fraction of observed entries; (b) Data with noise, horizontal axis: fraction of noisy observations.]

Figure 7: Clustering partially observed synthetic data. Figure 7a is the result without
noise and Figure 7b is the result with noise. In both cases, clustering with LR-SVP
performs significantly better than clustering with the signed Laplacian.



7 Related Work

Signed networks have been studied since the early 1950s. Harary and Cartwright were the
first to mathematically study structural balance. They defined balanced triads and proved
the global structure of balanced signed networks [Harary, 1953, Cartwright and Harary,
1956]. Davis [1967] further generalized the balance notion to weak balance by allowing
triads with all negative edges, and showed that weakly balanced graphs have mutually
antagonistic groups as their global structure.

Though theoretical studies of signed networks have been conducted for a long time, it
was not until this decade that analysis of real signed networks could be done at a large scale,
as large real networks have become more accessible recently. For example, Kunegis et al.
[2009] performed several analysis tasks on Slashdot, and Leskovec et al. [2010a,b] studied
the local and global structure of three real signed networks. They designed several
computational experiments to justify that the structure of these signed networks matches some
widely believed social theories.

In this paper, we focused on problems in signed networks. However, these problems
have their counterparts in unsigned networks. For instance, structural link prediction in
unsigned networks corresponds to the sign prediction problem. Structural link prediction
has been well explored, and it is usually solved by computing a similarity measure
between nodes [Liben-Nowell and Kleinberg, 2007], such as those proposed by Katz [1953]
and Adamic and Adar [2003]. The sign prediction problem, however, was not formally
considered until the work of Guha et al. [2004], who developed a trust propagation
framework to predict trust or distrust between entities. More recently, Kunegis et al.
[2009, 2010] reconsidered this problem using various similarity functions and kernels,
such as the matrix exponential and the signed Laplacian. Leskovec et al. [2010a] proposed a
machine learning formulation of this problem, arguing that learning from only the local triangular
structure of edges can achieve reasonable accuracy.

Sign prediction using our global method is closely related to the low-rank matrix
completion problem. In the last five years, there has been substantial research studying exact
recovery conditions for this problem [Candès and Recht, 2008, Candès and Tao, 2009], and
algorithms with theoretical guarantees have also been proposed, such as SVT [Cai et al.,
2010] and SVP [Jain et al., 2010]. Matrix factorization is another approximation technique
for matrix completion. Though this approach is notoriously hard to analyze, it is
very competitive in practice [Koren et al., 2009]. While the matrix completion problem
has been considered mostly in collaborative filtering, our low rank model arises naturally
from the weak balance of signed networks.

Clustering is another fundamental problem in network analysis. For unsigned networks,
there are several proposed algorithms that have been shown to be effective, such as
clustering via the graph Laplacian [Ng et al., 2001], modularity [Newman, 2006] and multilevel
approaches [Dhillon et al., 2007]. However, most of these approaches cannot be directly
extended to signed networks since weak balance theory does not apply to unsigned
networks. As a result, researchers have tried to tailor unsigned network clustering algorithms
in order to make them applicable to signed networks. For instance, Doreian and Mrvar
[1996] proposed a local search strategy which is similar to the Kernighan-Lin algorithm
[Kernighan and Lin, 1970]. Starting with an initial clustering assignment, it tries to move
nodes one by one to get a more preferable clustering. Yang et al. [2007] proposed an
agent-based method which basically conducts a random walk on the graph. Kunegis et al. [2010]
generalized spectral algorithms to signed networks. They proposed a spectral approach
using the so-called "signed" Laplacian, and showed that partitioning signed networks
into two groups using the signed Laplacian kernel is analogous to considering ratio cut
on unsigned networks. Anchuri and Magdon-Ismail [2012] proposed hierarchical iterative
methods that solve 2-way signed modularity objectives using spectral relaxation at each
hierarchy. Chiang et al. [2012] proposed some graph kernels for signed network clustering,
and showed that the multilevel framework can be extended to this problem.

Another line of research on the signed graph clustering problem is correlation clustering.
Correlation clustering is motivated by document classification: given a set of documents
with some pairs of documents labeled similar or different, the goal is to find a partition
such that documents in the same cluster are mostly similar [Bansal et al., 2004]. The
problem was first considered by Bansal et al. [2004], who proved that the problem is NP-hard
to optimize, and proposed two approximation algorithms to maximize the "agreement"
(defined as the number of edges that are correctly classified under a partition) and
minimize the "disagreement" (defined vice versa) under the special case that all pairwise label
information is given. The bounds for the general correlation clustering setting were provided
by Demaine et al. [2006]. On the other hand, some researchers have also considered the
correlation clustering problem from the statistical learning theory viewpoint. For
example, Joachims and Hopcroft [2005] give error bounds for the problem if only partial pairs
are observed. Recently, Cesa-Bianchi et al. [2012] proposed a method for sign prediction
by learning a correlation clustering index. They consider three types of learning models:
batch, online and active learning, and provide theoretical bounds for prediction mistakes
under each setting.



8 Conclusions and Future Work 



In this paper, we studied the usefulness of social balance on signed networks, with two
fundamental applications: sign prediction and clustering. Starting from a local view
of social balance, we proposed two families of sign prediction methods based on local
triads and cycles: prediction via measures of social imbalance (MOIs) and supervised 
learning based on high order cycles (HOCs). For both approaches, predictive accuracies 
are improved if longer cycles are taken into consideration, suggesting that a broader view 
of local patterns helps in sign prediction. We then considered the fully global perspective 
on social balance, and showed that the adjacency matrices of balanced networks are low 
rank. Based on this observation, we modeled the sign prediction problem as a low-rank 
matrix completion problem. We discussed three approaches to matrix completion: convex 
relaxation, singular value projection, and matrix factorization. In addition, we applied 
this low rank modeling technique to the clustering problem. In experiments, we observe 
that sign prediction via matrix factorization not only outperforms MOIs and HOCs, but 
requires much less running time. Clustering results are also more favorable via the matrix 
completion approach in comparison with the existing signed Laplacian approach. All of 
these results consistently demonstrate the effectiveness of the global viewpoint of social 
balance. 

For future work, one possible direction is to explore analysis tasks on heterogeneous
signed networks. Since there are different types of entities in heterogeneous networks,
currently there are no clear answers to questions such as: do balance relationships exist
on such networks? How do we quantitatively measure balance if balance patterns exist?
How is balance at a local level related to the global structure? Furthermore, another
possible direction is to examine other theories for analysis tasks on signed networks. For
example, some recent work [Leskovec et al., 2010a,b] has considered status theory. While
Leskovec et al. [2010a] found evidence that status theory holds in general in real signed
networks, patterns conforming to status theory are quite different from those conforming
to balance theory. Thus, it is natural to ask how to design algorithms by pursuing global
patterns conforming to status theory. These interesting directions are worth exploring in
future research.